ABSTRACT


The Simplex algorithm, developed by George B. Dantzig in 1947, represents a
quantum leap in the ability of applied scientists to solve complicated linear optimization
problems. Subsequently, its utility in solving finite models, including applications
in transportation, production planning, and scheduling, has made the algorithm an
indispensable tool to many in operations research.

This thesis is primarily an exploration of the Simplex algorithm, and a discussion
of the utility of the algorithm in unconventional optimization problems. The
mathematical theory upon which the algorithm is based and a general description
of the algorithm are presented. The reader is assumed to have little exposure to
convexity, duality, or the Simplex algorithm itself. More important to the thesis are
the examples that accompany the discussion of the Simplex algorithm. Herein are a
variety of unusual applications for the algorithm, including applications in infinite
dimensional vector spaces, uniform approximation, and computer assisted tomographic
image reconstruction. These examples serve both to facilitate a better understanding
of the algorithm, and to present it in unusual settings.
OVERVIEW OF THESIS


A.  THESIS  OBJECTIVES 

The development of the Simplex algorithm by Dantzig in the middle of this century
represents a milestone in linear optimization techniques. The impact of Dantzig’s
work is profound. Results of his work include the revival or introduction of a number
of mathematical disciplines, including convexity and duality theories. Applications
for the Simplex algorithm, and the accompanying refinements, are vast, and many
researchers continue to explore new and diverse applications.

The  majority  of  the  research  on  linear  optimization  problems  is  taking  place 
in  various  fields  of  Operations  Research.  Of  course,  the  Simplex  algorithm  itself 
is  particularly  well  suited  to  problems  in  that  particular  discipline,  rendering  rapid 
solutions  to  production  planning  models,  transportation  problems,  and  a  variety  of 
other  “real  world”  applications.  A  great  deal  of  work  was  done  up  to  the  early 
1970’s  in  attempts  to  mold  the  Simplex  algorithm  into  an  engineering  and  theoretical 
mathematical  tool.  With  the  advent  of  more  sophisticated  computer  hardware  and 
software,  there  may  be  utility  in  reconsidering  the  role  of  the  Simplex  algorithm  in 
control,  approximation,  and  other  infinite  dimensional  applications. 

This document is intended to serve two main purposes. First, the thesis is
intended to serve as an introduction to linear optimization and to the Simplex algorithm,
or as a theoretical review for readers already familiar with these topics. Second,
it is intended to present less traditional problems in a manner that is suitable for
solution with the Simplex algorithm.


B.  THESIS  FORMAT 

The  thesis  is  broken  into  three  parts.  The  first  part,  consisting  of  the  first 
two  chapters,  is  devoted  to  describing  sample  problems  with  which  the  theory  of  the 
Simplex  algorithm  is  illustrated.  Also  image  reconstruction  is  introduced,  a  problem 
whose  solution  by  the  Simplex  algorithm  highlights  the  thesis.  These  examples  are 
more  fully  developed  in  the  latter  sections. 

The first example is particularly unusual, as we find an orthogonal basis of
the infinite dimensional vector space $L^2[0,1]$. To the author’s knowledge, this is the
first attempt to use the Simplex algorithm in this capacity. The formulations that
result from this problem are particularly easy to understand, and lend a great deal
of insight into the concepts underlying the Simplex algorithm.

The second example appears only infrequently in the literature on linear optimization.
We seek the best approximation to the exponential function over a closed
interval in the uniform norm sense. That is, we formulate a uniform approximation
problem as a linear optimization problem. The formulation is used primarily in the
discussion of duality.

The  final  example  is  again  a  novel  one.  We  formulate  the  problem  of  computer 
assisted  tomographic  (CAT)  image  reconstruction  as  a  linear  optimization  problem, 
and  solve  a  small  sample  problem  with  the  Simplex  algorithm. 
The second portion, consisting of Chapters IV and V, introduces the machinery
behind the Simplex algorithm, culminating with a brief introduction to the algorithm
itself. Chapter IV is an exploration of convexity, both as it pertains to sets and
functions. The major emphasis of the chapter is on convex subsets of $\mathbb{R}^n$. Chapter V
builds on the convexity results as they pertain to duality. Fundamental concepts of
duality are presented in this chapter, and it concludes with a generic description of
the algorithm.

The  thesis  concludes  with  the  formulation  of  the  image  reconstruction  problem 
as  a  linear  optimization  problem  in  the  general  case.  The  first  portion  of  the  chapter 
is  devoted  to  the  formulation,  followed  by  the  statement  of  the  dual  problem.  Finally, 
a  sample  problem  is  solved,  and  some  analysis  of  the  appropriateness  of  the  Simplex 
algorithm  as  a  solution  tool  for  this  particular  problem  is  offered. 
II


PRELIMINARIES 


A.  OVERVIEW 

We  devote  this  chapter  to  the  preliminaries  of  linear  optimization.  We  begin 
by  defining  three  very  different  examples,  which  we  develop  as  a  means  to  explore 
linear  optimization  methods.  We  then  define  the  optimization  problem  in  general, 
and  the  linear  optimization  problem  specifically.  We  close  with  a  synopsis  of  the 
assumptions  that  characterize  the  linear  optimization  problem. 

B.  FIRST  EXAMPLES 

This  thesis  extensively  discusses  three  examples.  We  begin  by  stating  two  of 
our  three  examples  to  which  we  refer  throughout  the  thesis.  Because  of  its  complexity 
and  importance  to  this  work,  the  third  example  is  treated  separately. 

1. Example 1: Generation of an Orthogonal Basis for $L^2[0,1]$

Our first example is one of importance in many areas of approximation.
We wish to find some orthogonal basis for an infinite dimensional vector space. The
utility of such bases may be found in any elementary linear algebra or applied mathematics
text. The interested reader is referred to [Ref. 1]. The specific vector space
with which we are concerned is the space of functions defined by

$$ L^2[0,1] = \left\{ f : \|f\| = \left( \int_0^1 f(x)^2 \, dx \right)^{1/2} < \infty \right\}. $$

We note that the above norm is induced by the inner product,

$$ \langle f, g \rangle = \int_0^1 f(x) g(x) \, dx. $$

That is,

$$ \|f\| = \sqrt{\langle f, f \rangle}. $$

In particular, we seek an orthogonal polynomial basis, and derive an
optimization technique to find a polynomial $p_n$, of order $n$, when we are given a
set of orthogonal polynomials, $p_0, p_1, \ldots, p_{n-1}$. Recursive application of a method for
generating $p_n$ leads to a complete set of basis polynomials. The polynomial basis is
of particular importance, as the Weierstrass Approximation Theorem assures us that
any continuous function $f$, defined on $[0,1]$, may be approximated arbitrarily well
with polynomials [Ref. 2].

There are a number of existing techniques for the generation of orthogonal
polynomials. For example, the Gram-Schmidt algorithm may be applied to
the sequence $\{1, x, x^2, \ldots, x^n, \ldots\}$. Another approach involves solving a three-term
recurrence that generates the polynomials. We consider an optimization approach,
in which we formulate an optimization problem whose solution gives us $p_n$. It is an
approach that is suitable for inductive iteration.
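
For comparison with the optimization approach, the Gram-Schmidt technique mentioned above can be sketched in a few lines. This is only an illustrative sketch, not part of the thesis: the function names `inner` and `monic_orthogonal`, and the choice to store a polynomial as a list of coefficients in ascending order, are assumptions made for the example. Exact rational arithmetic is used so the results can be compared directly with hand calculations.

```python
from fractions import Fraction

def inner(p, q):
    """Exact <p, q> = integral over [0, 1] of p(x) q(x) dx.

    Polynomials are coefficient lists in ascending order: [c0, c1, ...].
    """
    return sum(Fraction(a) * Fraction(b) / (i + j + 1)
               for i, a in enumerate(p)
               for j, b in enumerate(q))

def monic_orthogonal(n_max):
    """Classical Gram-Schmidt on 1, x, ..., x^n_max, keeping each result monic."""
    basis = []
    for n in range(n_max + 1):
        mono = [Fraction(0)] * n + [Fraction(1)]   # the monomial x^n
        p = list(mono)
        for q in basis:
            coeff = inner(mono, q) / inner(q, q)   # projection of x^n onto q
            for k, c in enumerate(q):
                p[k] -= coeff * c
        basis.append(p)
    return basis

basis = monic_orthogonal(2)
for n, p in enumerate(basis):
    print(n, [str(c) for c in p])
```

Under these assumptions the first-order result is $x - \frac{1}{2}$, which is the polynomial an optimization formulation of the same problem should also produce.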
2. Example 2: Uniform Approximation of the Exponential Function

The second example is a specific problem in uniform approximation.
We seek the linear combination of polynomials on the interval $[0,3]$ that best approximates
the exponential function in the uniform norm sense. The problem, consequently,
is to find the coefficients $\alpha_i$ that minimize the expression

$$ \max_{t \in [0,3]} \left| f(t) - e^t \right|, \quad \text{where } f(t) = \sum_{i=0}^{n} \alpha_i t^i. $$

We  consider  specific  cases  of  this  example.  That  is,  we  seek  the  polynomial  for  some 
fixed  degree,  n,  that  best  approximates  the  exponential  function.  Note  that  the 
uniform  approximation  problem  is  fundamentally  an  optimization  problem.  The  use 
of  the  Simplex  algorithm  to  solve  the  problem  is,  however,  unusual. 
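
To make the quantity being minimized concrete, the following sketch estimates the uniform error of a candidate polynomial by sampling the interval. It is an illustration only: the helper names `poly_value` and `uniform_error`, the grid size, and the two Taylor polynomials used as candidates are assumptions of the example, and a finite grid can only estimate the true maximum from below.

```python
import math

def poly_value(coeffs, t):
    """Evaluate sum(coeffs[i] * t**i) by Horner's rule."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * t + c
    return value

def uniform_error(coeffs, lo=0.0, hi=3.0, samples=3001):
    """Largest sampled |f(t) - e^t| on [lo, hi] (a grid estimate of the uniform norm)."""
    step = (hi - lo) / (samples - 1)
    return max(abs(poly_value(coeffs, lo + k * step) - math.exp(lo + k * step))
               for k in range(samples))

# Two hypothetical candidates: Taylor polynomials of e^t about t = 0.
taylor_1 = [1.0, 1.0]        # 1 + t
taylor_2 = [1.0, 1.0, 0.5]   # 1 + t + t^2 / 2

print(uniform_error(taylor_1), uniform_error(taylor_2))
```

Neither candidate is claimed to be optimal; the point is only that each choice of coefficients yields a single number, the worst-case error, and the problem asks for the coefficients that make that number as small as possible.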

C.  EXAMPLE  3:  THE  IMAGE  RECONSTRUCTION 
PROBLEM 

The  third  example  is  the  image  reconstruction  problem.  As  with  the  first  two 
examples,  there  are  many  existing  techniques  for  solving  this  problem.  Unlike  the 
others,  however,  this  is  an  active  area  of  modern  research,  and  the  best  methods 
of  solution  may  yet  be  unknown.  The  reader  is  referred  to  [Ref.  3]  for  a  thorough 
treatment  of  the  problem,  and  to  [Ref.  4]  and  [Ref.  5]  for  an  introduction  to  some 
recently  developed  solution  techniques. 
Suppose a neurosurgeon wishes to rule out the possibility that a patient, Fred,
suffers from a brain tumor. Further, the physician opts to make use of the CAT
(Computer Aided Tomography) scan device, and examine the inside of Fred’s head
without exploratory surgery.

The CAT scan machine works by projecting a finite number of X-rays of known
intensity into the patient’s head from a finite number of positions. The intensity of
the X-rays upon leaving Fred’s head is measured. The intensity of the emergent X-ray
depends essentially on the density of Fred’s head over the locations through which
the X-ray passes. Once data have been collected from a number of X-rays, they
are processed, forming a model of the density of Fred’s head. That is, the processing
of the data results in the construction of an image, and presumably, an image that
closely corresponds to the interior of Fred’s head. This data processing, in this
example, constitutes solving the image reconstruction problem.

1.  X-Ray  Computed  Tomography 

Understanding  the  methods  of  reconstruction  requires  that  we  know  the 
process  by  which  the  data  for  reconstruction  are  obtained.  We  begin  with  a  basic 
discussion  of  the  manner  in  which  an  X-ray  moves  through  an  object  of  homogeneous 
density,  then  derive  the  manner  in  which  it  moves  through  more  complicated  media. 

It has been shown empirically that the fractional decrease in beam
intensity of a narrow beam of X-ray photons passing through a homogeneous material
is given by the relationship ([Ref. 6])

$$ \frac{I}{I_0} = e^{-(\rho \, \Delta x)}, $$

where $I_0$ is the X-ray input intensity, and $I$ is the observed intensity after the ray
passes a distance $\Delta x$ through the material. See Figure 1. The parameter $\rho$ is determined
by the density of the material.1 For two media, the fractional decrease is,
predictably,

$$ \frac{I}{I_0} = e^{-(\rho_1 \Delta x_1 + \rho_2 \Delta x_2)}, $$

where $\Delta x_i$ denotes the distance the X-ray travels through the $i$th medium.

Figure 1. An X-Ray Over a Homogeneous Object: $I_0$ = Input Intensity, $I$ = Emergent
Intensity, $\rho$ = Density.

1 $\rho$ also depends, to a lesser extent, on a number of other factors, including the nuclear composition
characterized by the atomic number $Z$, a function of the homogeneous material. [Ref. 3] pertains.
For the purposes of this paper, the effects of other parameters are assumed to be nil.


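Let us partition the media through which the narrow beam travels into
$n$ homogeneous segments. Denote the density over the $i$th segment by $\rho(x_i)$. The
decrease in this case is expressed by

$$ \frac{I}{I_0} = \exp\left( -\sum_{i=1}^{n} \rho(x_i) \, \Delta x_i \right). \qquad \text{(II.1)} $$

Letting $n \to \infty$ and $\Delta x_i \to 0$, equation (II.1) becomes

$$ \frac{I}{I_0} = \lim_{n \to \infty,\ \Delta x \to 0} \exp\left( -\sum_{i=1}^{n} \rho(x_i) \, \Delta x_i \right), $$

implying

$$ \frac{I}{I_0} = \exp\left( -\int \rho(x) \, dx \right). \qquad \text{(II.2)} $$

Concluding, let $l$ be the line describing the path of the X-ray, and let the function $f(x, y)$
be the density of the media over $l$. Let $ds$ denote an increment of length along the line $l$. Equation
(II.2) may be written in the form

$$ -\ln \frac{I}{I_0} = \int_l f(x, y) \, ds. \qquad \text{(II.3)} $$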
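
The passage from the segmented medium to the line integral can be checked numerically. The sketch below is an illustration under stated assumptions: the function names, the sample densities, and the use of a midpoint Riemann sum along a single coordinate axis are all choices made for the example rather than anything prescribed by the thesis.

```python
import math

def transmitted_fraction(densities, lengths):
    """I / I_0 for a ray crossing homogeneous segments, as in equation (II.1)."""
    return math.exp(-sum(rho * dx for rho, dx in zip(densities, lengths)))

def projection(density, a, b, n=1000):
    """Midpoint Riemann sum for the integral of density over [a, b].

    For a ray travelling along a coordinate axis this approximates the
    right-hand side of (II.3).
    """
    dx = (b - a) / n
    return sum(density(a + (k + 0.5) * dx) for k in range(n)) * dx

# Hypothetical two-medium example: densities 1.0 and 2.0, path lengths 0.5 and 0.25.
fraction = transmitted_fraction([1.0, 2.0], [0.5, 0.25])
ray_sum = -math.log(fraction)          # recovers 1.0 * 0.5 + 2.0 * 0.25

# Hypothetical continuous density rho(x) = x on [0, 1]; the exact integral is 1/2.
approx = projection(lambda x: x, 0.0, 1.0)
print(fraction, ray_sum, approx)
```

Taking the negative logarithm of the measured ratio turns the multiplicative attenuation law into a sum, which is what makes the measurement linear in the unknown density.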

2.  The  Radon  Transform 


This section is an introduction to the Radon Transform, and elaborates
its relation to the data collected with the X-ray. We first define the transform, then
briefly describe some of its properties. The discussion in this section pertains to the
two-dimensional case. That is, we wish to find the density of an object defined over
some subset of $\mathbb{R}^2$. For generalizations into higher dimensions, see [Ref. 3].

We begin by considering some density function $f$, defined and bounded
on a simply connected, compact subset $\Omega \subset \mathbb{R}^2$. Define $L$ to be the set of all straight
lines passing through any portion of $\Omega$. That is, $L = \{ l : l \cap \Omega \neq \emptyset \}$. Note that
the cardinality of $L$ is uncountably infinite. The Radon transform is defined by all
possible line integrals of the form:

$$ \check{f} = \int_{l_j} f(x, y) \, ds, \quad j \in J, \qquad \text{(II.4)} $$

where $ds$ is an increment of length along $l_j$, and $J$ is the index set of the set $L$.

Consider how the lines, over which the integrals above are computed,
are determined. Let $\mu = [\cos\phi, \sin\phi]^T$. Then for a fixed angle of rotation $\phi$ and
a distance $p$ from the origin, we may identify each line, $l$, by the set of vectors,
$\mathbf{x} = [x, y]^T$, that satisfy the equation

$$ \langle \mathbf{x}, \mu \rangle = x \cos\phi + y \sin\phi = p. $$

(See Figure 2). Consequently, we may denote each of the line integrals defining the
Radon transform by

$$ \check{f}(\phi, p) = \int_{\langle \mathbf{x}, \mu \rangle = p} f(\mathbf{x}) \, d\mathbf{x}. \qquad \text{(II.5)} $$

Again, it is vital to note that the Radon transform is defined by the
collection of all such line integrals. Consequently, to determine the Radon transform
fully, we must know $\check{f}(\phi, p)$ for all values of $\phi$ and $p$. When we know the value of the
line integrals for only certain values of $\phi$ and $p$, we say that we have a sample of the
transform.
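
A single value in a sample of the transform can be approximated numerically by walking along the line $x \cos\phi + y \sin\phi = p$. The following sketch is an assumption-laden illustration: the function name `radon_sample`, the midpoint rule, the integration window, and the test density (constant density 1 on the unit disk) are all chosen for the example. For that density the exact line integral is the chord length $2\sqrt{1 - p^2}$ when $|p| \le 1$, which gives something to compare against.

```python
import math

def radon_sample(f, phi, p, half_length=2.0, n=20000):
    """Approximate the line integral of f over the line x cos(phi) + y sin(phi) = p.

    Points on the line are written as p * mu + s * perp, where mu is the unit
    normal [cos(phi), sin(phi)] and perp is the unit direction of the line.
    """
    mu = (math.cos(phi), math.sin(phi))
    perp = (-math.sin(phi), math.cos(phi))
    ds = 2.0 * half_length / n
    total = 0.0
    for k in range(n):
        s = -half_length + (k + 0.5) * ds      # midpoint of the k-th increment
        x = p * mu[0] + s * perp[0]
        y = p * mu[1] + s * perp[1]
        total += f(x, y)
    return total * ds

def unit_disk(x, y):
    """Hypothetical density: 1 inside the closed unit disk, 0 outside."""
    return 1.0 if x * x + y * y <= 1.0 else 0.0

value = radon_sample(unit_disk, phi=0.3, p=0.5)
print(value, 2.0 * math.sqrt(1.0 - 0.5 ** 2))
```

Evaluating `radon_sample` for finitely many pairs of angle and distance produces exactly the kind of finite sample described above.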
Figure 2. The Line, $l$, as it Relates to $(p, \phi)$


3.  The  Problem  Statement 

We  note  that  the  right  hand  sides  of  Equations  (II.3)  and  (II.4)  are 
identical.  We  conclude,  then,  that  if  the  X-ray  is  sufficiently  narrow,  and  we  are 
able  to  take  an  X-ray  along  all  possible  lines,  the  resultant  infinite  collection  of  data 
corresponds  to  the  Radon  transform  of  the  desired  density  function. 

The Radon transform has been shown to be one-to-one ([Ref. 3]). That
is, when all values of the line integrals are known, one may determine the unique
density that produces the observed transform data. However, in most cases of practical
interest, we are presented with but a sample of the transform from which to
reconstruct an image. That is, we are able to collect only a finite number of X-rays.
Additionally, the photon beam is not sufficiently narrow to yield a true line integral
defining the transform. In this case, inverting the transform is an ill-posed problem.
If there exists one density function whose sampled Radon transform equals a given
data set, then there exist infinitely many density functions, $f$, such that $\check{f} = b$, where
$b$ is the data obtained from a transform sample. It is this fact that leads us to
investigate an optimization approach to the image reconstruction problem.

D.  OPTIMIZATION 

Each of the examples can be formulated as an optimization problem. Fundamental
to any optimization problem, and to the Linear Optimization Problem in
particular, are the concepts of feasible set and objective function.

1.  Feasible  Sets 

To  help  explain  a  feasible  set,  we  consider  an  example.  Suppose  we 
wish  to  model  the  production  schedule  for  a  baseball  and  softball  manufacturing 
plant.  The  company  is  required  to  make  at  least  500  baseballs  and  1000  softballs 
each  month  to  satisfy  contractual  agreements.  The  company  expects  to  procure  2,000 
pounds  of  stuffing  material,  and  3,000  square  feet  of  leather  covers.  Each  baseball 
requires  ~  pound  of  stuffing,  and  |  square  feet  of  leather.  The  requirements  for  the 
softballs  are  |  pounds  and  |  square  feet  of  stuffing  and  leather  respectively.  Then 
of  all  possible  production  schedules,  we  restrict  our  attention  to  those  that  fulfill 
contractual  requirements  and  do  not  utilize  assets  which  are  not  available.  Let  b  and 
s  be  the  number  of  baseballs  and  softballs,  respectively,  produced  in  a  month.  Then 
we  require  that: 

$$ b \ge 500 $$

$$ s \ge 1{,}000 $$


\b+ls  <  2,000 

4  o 

lb+js  <  2,000.  (II. 6) 

«  * 

We have defined a subset of all possible schedules by a group of mathematical relationships.
In this example, the feasible set is the set of all production schedules
that satisfy the inequalities of (II.6). In general, we define the feasible set, $Y$, to be the
collection of values satisfying the mathematical relationships imposed by the problem.

2.  The  Objective  Function 

The objective function, $g$, defined over a feasible set, $Y$, is the function
by which one models the quality of a solution. In the manufacturing schedule example,
we might logically define the objective function to be profit. Supposing that each
baseball contributes \$1 of profit, and each softball \$0.75, we could write our objective
function:

$$ g = b + 0.75 s, $$

and we seek the maximum value of $g$ over $Y$.

Simply stated, an optimization problem is expressed by “Considering
all members of the feasible set, $Y$, which member(s) results in the optimum value of
the objective function, $g$?”
E. LINEAR OPTIMIZATION


The Linear Optimization Problem, or LOP, is defined by the criteria that
the objective function and the relationships defining the feasible set be linear in our
decision variables, that is, the variables representing the values we seek. Then we may
write the LOP as follows:


Let a vector $c = [c_1, c_2, \ldots, c_n]^T \in \mathbb{R}^n$, a non-empty index set $S$, and for every
$s \in S$ a vector $a(s) \in \mathbb{R}^n$ and a real number $b(s)$ be given. Defining $\langle u, v \rangle$ as
the standard inner product, we seek a vector $y \in \mathbb{R}^n$, called the optimal vector,
that minimizes:

$$ \langle c, y \rangle $$

while satisfying:

$$ \langle a(s), y \rangle \ge b(s), $$

for all $s \in S$.

We observe that a linear maximization problem may be put into the form
above in the following way. Whenever the maximum of an objective function $f$ over $Y$
exists, $\max(f) = -\min(-f)$, and the maximizers of $f$ are exactly the minimizers of $-f$.
We may therefore equivalently seek to minimize $-f$.

A similar change may be made in the constraints to reverse inequalities if
necessary. That is, the problems

Maximize: $\langle c, y \rangle$

Subject to: $\langle a(s), y \rangle \le b(s)$

for all $s \in S$

and

Minimize: $-\langle c, y \rangle$

Subject to: $-\langle a(s), y \rangle \ge -b(s)$

for all $s \in S$

are equivalent: they have the same optimal vectors, and their optimal values differ only in sign.

1. The Linear Program

The case where the cardinality of $S$ is $m < \infty$ defines a Linear Program.
This special case of the LOP is of particular interest as it forms the basis for finding
solutions to LOPs when the index set $S$ is infinite. Throughout this thesis, the reader
may assume that discussion of the general linear optimization problem permits the
possibility of an infinite index set $S$, unless explicitly noted otherwise.

Now, however, we examine this Linear Programming case to clarify the
concept of linearity. The problem becomes

$$ \begin{aligned} \text{minimize} \quad & \langle c, y \rangle \\ \text{subject to:} \quad & \langle a(s_i), y \rangle \ge b(s_i) \quad \text{for } i = 1, 2, \ldots, m, \end{aligned} \qquad \text{(II.7)} $$

over all $y \in \mathbb{R}^n$.

Let $a_j(s_i)$ denote the $j$th component of the vector $a(s_i)$. We may write the problem
as

$$ \begin{aligned}
\text{Minimize} \quad & c_1 y_1 + c_2 y_2 + \cdots + c_n y_n \\
\text{Subject to:} \quad & a_1(s_1) y_1 + a_2(s_1) y_2 + \cdots + a_n(s_1) y_n \ge b(s_1) \\
& a_1(s_2) y_1 + a_2(s_2) y_2 + \cdots + a_n(s_2) y_n \ge b(s_2) \\
& \qquad \vdots \\
& a_1(s_m) y_1 + a_2(s_m) y_2 + \cdots + a_n(s_m) y_n \ge b(s_m)
\end{aligned} \qquad \text{(II.8)} $$

over all $y \in \mathbb{R}^n$.

We note that in this case, we may define the feasible set by the notation

$$ A^T y \ge b \qquad \text{(II.9)} $$

with $A \in \mathbb{R}^{n \times m}$, where the $i$th column of $A$ is $a(s_i)$. The $i$th component of the vector $b$
is given by $b(s_i)$. The linearity assumptions can be expressed as follows [Ref. 7].2

1. Proportional: The objective function is linear in the feasible set, $Y$,
in the following sense. Given a variable, $y_j$, its contribution to the objective function
is $c_j y_j$. So then a change of $d$ units in $y_j$ results in a change in the objective function
value of $c_j d$. Similarly, the constraints are linear with respect to the variable $y_j$,
insofar as the contribution of the variable $y_j$ to the $i$th constraint is $a_j(s_i) y_j$. Then
changing the value of $y_j$ by $d$ units changes the value of the left-hand side of the $i$th
constraint by $a_j(s_i) d$ units.

2. Deterministic: The components of the vectors $a(s)$ and $c$ are all
determined, as is each scalar $b(s)$. That is, if the components are derived from some
stochastic model, their variability is disregarded, and the numbers are fixed for a
given linear optimization problem.

Having defined the Linear Optimization problem,
we now turn our attention to exploring the utility of solution techniques to non-traditional
optimization problems.

2 [Ref. 7] also identifies the qualities of additivity and divisibility as requirements of the linear
optimization problem. These qualities are deemed to be inherent in the qualities defined above.
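
The matrix form of the feasible set and the proportionality property can be made concrete with a small sketch. Everything numerical here is hypothetical: the three constraint vectors, the right-hand sides, and the cost vector are invented for illustration, and the helper names `constraint_values`, `is_feasible`, and `objective` are assumptions of the example rather than notation from the thesis.

```python
def constraint_values(columns, y):
    """Return the vector A^T y, where columns[i] plays the role of a(s_i)."""
    return [sum(a_j * y_j for a_j, y_j in zip(a, y)) for a in columns]

def is_feasible(columns, b, y, tol=1e-12):
    """True when every constraint <a(s_i), y> >= b(s_i) holds (up to tol)."""
    return all(v >= b_i - tol for v, b_i in zip(constraint_values(columns, y), b))

def objective(c, y):
    """The linear objective <c, y>."""
    return sum(c_j * y_j for c_j, y_j in zip(c, y))

# Hypothetical data with n = 2 variables and m = 3 constraints.
columns = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # a(s_1), a(s_2), a(s_3)
b = [0.0, 0.0, 1.0]
c = (1.0, 2.0)

print(is_feasible(columns, b, (1.0, 0.0)))     # satisfies all three constraints
print(is_feasible(columns, b, (0.25, 0.25)))   # violates the third constraint

# Proportionality: changing y_1 by d changes the objective by c_1 * d.
d = 0.5
change = objective(c, (1.0 + d, 0.0)) - objective(c, (1.0, 0.0))
print(change, c[0] * d)
```

Storing the constraint vectors as the columns of $A$ means that testing feasibility is a single pass over the index set, which is only possible in this finite case.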
III


THE EXAMPLES, A DIFFERENT PERSPECTIVE

A.  OVERVIEW 

This  section  addresses  some  of  the  basic  properties  of  the  sets  from  which  we 
choose  an  optimal  vector  in  our  examples.  It  is  the  structure  which  we  are  able  to 
assign  to  these  sets  that  permits  us  to  exploit  the  theories  regarding  convexity,  and 
subsequently,  the  duality  results  which  we  derive  in  subsequent  chapters.  We  then 
introduce  assumptions  that  refine  the  feasible  sets. 

B.  LINEAR  VECTOR  SPACES 

Before proceeding to the specific examples, we first turn our attention to the
matter of linear vector spaces. A vector space, $L$, is called a linear vector space if for
any vectors $x, y, z \in L$ and any real scalars $\alpha$ and $\beta$ the following results hold [Ref.
1]:

1. $\alpha(x + y) = (\alpha x + \alpha y) \in L$

2. $\alpha(\beta x) = (\alpha\beta)x$

3. $x + y = y + x$

4. $x + (y + z) = (x + y) + z$

For each of the example problems, the feasible set is a subset of a linear vector
space. Consider the problem of finding an orthogonal polynomial, $p_n$, of order $n$. It
is elementary that the set of polynomials of order at most $n$ forms a linear vector space. The
same holds for the problem of finding the polynomial that best approximates the
exponential function on $[0,3]$. Finally, in example three, we have specified that we
wish to find a density function, $f$, from the set of all bounded, piecewise continuous
functions with support over a compact set $\Omega$. The set of all such functions is a linear
vector space.

Equally important to our discussion is the concept of a norm. In general, a
norm on a linear vector space $L$ is defined to be a mapping, denoted $\|\cdot\| : L \to \mathbb{R}^+$,
satisfying the following rules [Ref. 1]. For all $x, y \in L$, and $\alpha \in \mathbb{R}$,

1. $\|x\| \ge 0$, and $\|x\| = 0 \iff x = 0$,

2. $\|\alpha x\| = |\alpha| \, \|x\|$,

3. $\|x + y\| \le \|x\| + \|y\|$.

Any linear vector space equipped with such a function is said to be a normed linear
vector space. Each of the feasible sets of the examples is a subset of a normed linear
vector space. The first two examples are clearly so. Any norm on $\mathbb{R}^n$ suffices. In the
third example, we use the infinity norm, defined by:

$$ \|f\|_\infty = \sup_{w \in \Omega} |f(w)| $$

as an appropriate norm.
C. REFINING THE FEASIBLE SUBSET OF THE ORTHOGONAL
BASIS PROBLEM

In the first example, we are interested in finding a polynomial, $p_n$, of order $n$,
such that:

$$ \langle p_n, p_i \rangle = \int_0^1 p_n p_i \, dx = 0, \quad \text{for all } 0 \le i \le n - 1, $$

where the result is assumed true for all $p_i, p_j$, $i \ne j$. That is, given orthogonal polynomials
$p_0, p_1, \ldots, p_{n-1}$, we seek a polynomial of order $n$, orthogonal to all of the
polynomials of lower order. We formulate this problem:

$$ \begin{aligned} \text{minimize:} \quad & \sum_{i=0}^{n-1} \int_0^1 p_n p_i \, dx \\ \text{subject to:} \quad & \int_0^1 p_n p_i \, dx \ge 0, \quad \text{for } i = 0, 1, \ldots, n - 1. \end{aligned} \qquad \text{(III.1)} $$

Theorem 1. The optimal objective function value for the orthogonal polynomial
problem is zero, and any optimal vector, $p_n$, satisfies the desired orthogonality
conditions.

Proof: Since we know triangular families of orthogonal polynomials exist, we conclude
immediately that the optimal objective function value is bounded above by zero.
The constraints give us zero as a lower bound. That any optimal vector satisfies our
orthogonality conditions is immediate from these facts. That is, a zero objective
function value, in conjunction with the constraints, ensures orthogonality. □

There are infinitely many polynomials that satisfy the above criteria. Specifically,
if the objective function evaluates to zero for some $p_n$, it clearly evaluates to
zero for $\alpha p_n$, for any $\alpha \in \mathbb{R}$. Consequently, we add the additional constraint that the
polynomial we desire is the monic orthogonal polynomial. The additional constraint
leads rather easily to an $n \times n$ linearly independent system of inequalities, where the
unknown element of $\mathbb{R}^n$ is the vector whose components are the coefficients of the
desired polynomial.

To illustrate, let us consider the specific cases of finding the first order and second
order polynomials satisfying (III.1). We input the zeroth order monic polynomial,
$p_0 = 1$, to start the iterative process.

In the first order case, the objective function

$$ \sum_{i=0}^{n-1} \int_0^1 p_n(x) p_i(x) \, dx $$

is simply

$$ \int_0^1 p_1(x) \, dx = \int_0^1 (x + \alpha) \, dx = \frac{1}{2} + \alpha. $$

The optimization problem takes on the form,

$$ \begin{aligned} \text{minimize:} \quad & \tfrac{1}{2} + \alpha \\ \text{subject to:} \quad & \tfrac{1}{2} + \alpha \ge 0, \end{aligned} $$

from which we observe that $\alpha = -\frac{1}{2}$, and conclude that $p_1(x) = x - \frac{1}{2}$. While the
solution of this particular problem is trivial, there are some important conceptual
principles working here. Considering the problem in terms of the linear optimization
problem, observe that the feasible set is the set of all real numbers $\alpha$, with $\alpha \ge -\frac{1}{2}$.
As the function we seek to minimize takes on the form $C + \alpha$, where $C$ is a fixed
constant, we clearly wish to select the smallest possible value for $\alpha$.


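Similarly, consider the formulation of the problem of finding the second order
polynomial. We define the polynomials $p_0$ and $p_1$ as above, and let $p_2 = x^2 + \alpha_1 x + \alpha_0$.
Computing the integrals, we find that

$$ \int_0^1 p_0 p_2 \, dx = \int_0^1 \left( x^2 + \alpha_1 x + \alpha_0 \right) dx = \frac{1}{3} + \frac{\alpha_1}{2} + \alpha_0, $$

and

$$ \begin{aligned}
\int_0^1 p_1 p_2 \, dx &= \int_0^1 \left( x - \tfrac{1}{2} \right) \left( x^2 + \alpha_1 x + \alpha_0 \right) dx \\
&= \int_0^1 \left[ x^3 + \left( \alpha_1 - \tfrac{1}{2} \right) x^2 + \left( \alpha_0 - \tfrac{\alpha_1}{2} \right) x - \tfrac{\alpha_0}{2} \right] dx \\
&= \frac{1}{4} + \left( \frac{\alpha_1}{3} - \frac{1}{6} \right) + \left( \frac{\alpha_0}{2} - \frac{\alpha_1}{4} \right) - \frac{\alpha_0}{2} \\
&= \frac{1}{12} + \frac{\alpha_1}{12}.
\end{aligned} $$

Then the linear optimization problem

$$ \begin{aligned} \text{minimize:} \quad & \sum_{i=0}^{n-1} \int_0^1 p_i p_n \, dx \\ \text{subject to:} \quad & \int_0^1 p_i p_n \, dx \ge 0, \quad i = 0, 1, \ldots, n - 1, \end{aligned} $$

becomes:

$$ \begin{aligned} \text{minimize:} \quad & \tfrac{5}{12} + \tfrac{7}{12} \alpha_1 + \alpha_0 \\ \text{subject to:} \quad & \tfrac{1}{3} + \tfrac{1}{2} \alpha_1 + \alpha_0 \ge 0, \\ & \tfrac{1}{12} + \tfrac{1}{12} \alpha_1 \ge 0. \end{aligned} \qquad \text{(III.2)} $$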
Figure 3. The Feasible Set of Example 1; $n = 2$


As we are currently finding the feasible set, and viewing the problem in terms
of the general formulation, we make the following observations. The index set $S$ has
cardinality 2. By rearranging the constraint equations, we find the constraint vectors
are

$$ a(s_1) = \left[ \tfrac{1}{2} \;\; 1 \right], \qquad a(s_2) = \left[ \tfrac{1}{12} \;\; 0 \right], $$

with $b(s_1) = -\frac{1}{3}$, and $b(s_2) = -\frac{1}{12}$. As the vector we seek is $y = [\alpha_1 \;\; \alpha_0]^T \in \mathbb{R}^2$, we
may illustrate the feasible set as in Figure 3.
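
The constraint data for the second order case can be checked with exact arithmetic. The sketch below rests on one assumption that is stated here rather than taken from the thesis: because the objective is the sum of the two constraint left-hand sides, its minimum is reached where both constraints hold with equality, so the optimal vector is found by solving that 2 x 2 system. The helper names `dot` and `solve_2x2` are hypothetical.

```python
from fractions import Fraction as F

# Constraint data for n = 2, with y = [alpha_1, alpha_0].
a1, b1 = (F(1, 2), F(1)), F(-1, 3)
a2, b2 = (F(1, 12), F(0)), F(-1, 12)
c, c0 = (F(7, 12), F(1)), F(5, 12)     # objective is <c, y> + c0

def dot(u, v):
    """Standard inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def solve_2x2(r1, r2, rhs):
    """Cramer's rule for the system <r1, y> = rhs[0], <r2, y> = rhs[1]."""
    det = r1[0] * r2[1] - r1[1] * r2[0]
    y0 = (rhs[0] * r2[1] - r1[1] * rhs[1]) / det
    y1 = (r1[0] * rhs[1] - rhs[0] * r2[0]) / det
    return (y0, y1)

# Assumed optimum: the vertex where both constraints are active.
y = solve_2x2(a1, a2, (b1, b2))
objective = dot(c, y) + c0
print("alpha_1 =", y[0], " alpha_0 =", y[1], " objective =", objective)
```

Under that assumption the vertex corresponds to the polynomial $x^2 - x + \frac{1}{6}$, and the objective value there is zero.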

We observe that we have a problem of finding the optimal vector in $\mathbb{R}^2$ when
seeking the second order orthogonal polynomial. This property generalizes for any
order of polynomial. That is, if we seek a polynomial of order $n$, we seek a vector,
$y \in \mathbb{R}^n$, giving the coefficients for the optimal monic polynomial.

D.  THE  FEASIBLE  SET  IN  THE  UNIFORM  APPROX¬ 
IMATION  PROBLEM 

Consider  the  problem  of  approximating  the  exponential  function,  c‘,  in  the 
interval  [0, 3]  by  a  linear  combination  of  polynomials.  We  have  specified  that  we  wish 
to  find  the  combination  that  minimizes  the  maximum  residual  over  the  interval,  and 
not  the  total  residual.  Hence,  we  are  not  solving  the  least  squares  problem,  where 
orthogonality  of  the  approximating  functions  dramatically  simplifies  the  task.  With 
the  uniform  approximation  problem,  however,  orthogonality  of  the  polynomials  is  not 
particularly  useful.  Therefore,  rather  than  using  the  orthogonal  polynomials  above, 
we  merely  specify  the  degree  of  the  approximating  polynomial.  Thus  we  seek  a  linear 
combination of the polynomials $P_0, P_1, \ldots, P_n$,

where $P_i(t) = t^i$, $i = 0, 1, 2, \ldots, n$.

Consider the specific example for $n = 1$. We seek a polynomial

$\langle T, y \rangle$, where $T = [1, t]^T$, and $y = [\alpha_0, \alpha_1]^T \in \mathbb{R}^2$.

Since the vector $T$ is fixed, the problem is equivalently one of determining the optimal
vector, $y \in \mathbb{R}^2$. We summarize with a preliminary statement of the problem.

minimize: $\max_{t \in [0,3]} \left| \left( \sum_{i=0}^{n} \alpha_i t^i \right) - e^t \right|$

over all $y = [\alpha_0, \ldots, \alpha_n]^T \in \mathbb{R}^{n+1}$.

Observe that the objective function is non-linear in the decision variables, $\alpha_i$, $i = 0, \ldots, n$. Also observe that the feasible set is $\mathbb{R}^{n+1}$ in its entirety. That is, any
combination of real coefficients is feasible, since there are currently no constraints.
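
To make the objective concrete, the sketch below evaluates the maximum residual of a candidate coefficient vector on a uniform grid over $[0, 3]$. It is an illustration only: the grid merely approximates the true maximum, and the function name `max_residual`, the sample count, and the two candidate lines are assumptions introduced here rather than anything taken from the text.

```python
import math

def max_residual(coeffs, lo=0.0, hi=3.0, samples=3001):
    """Approximate max over [lo, hi] of |sum(coeffs[i] * t**i) - exp(t)| on a uniform grid."""
    worst = 0.0
    for k in range(samples):
        t = lo + (hi - lo) * k / (samples - 1)
        value = sum(c * t**i for i, c in enumerate(coeffs))
        worst = max(worst, abs(value - math.exp(t)))
    return worst

# Two hypothetical candidates for n = 1: the chord through the endpoints of exp on [0, 3],
# and the same line shifted down by half of the chord's maximum residual.
slope = (math.exp(3.0) - 1.0) / 3.0
chord = [1.0, slope]
shifted = [1.0 - max_residual(chord) / 2.0, slope]

chord_error = max_residual(chord)
shifted_error = max_residual(shifted)
```

Because the chord lies on or above $e^t$ throughout the interval, shifting it down by half of its largest residual balances the positive and negative errors, so `shifted_error` is half of `chord_error`. Any coefficient vector can be scored this way, which reflects the observation that the feasible set is unconstrained while the objective is not linear.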

E.  CONVENTIONS  OF  IMAGE  RECONSTRUCTION 

For Example 3, we have specified that we wish to find some function, $f$, defined
on the simply connected, compact set $\Omega \subset \mathbb{R}^2$. Assume that $\Omega$ is a circle of radius
1. We also assume that the function that we seek is piecewise continuous on $\Omega$. The
piecewise continuity restriction is justified by the physical nature of the problem we
are solving. We call the space of such functions $F$. Here it is useful to define a basis
for $F$, and we select a logical basis in view of the problem we wish to solve.

As we have stated, the formal inverse of the Radon transform is well
defined.  Our  difficulty  results  from  our  inability  to  compute  the  uncountably  infinite 
number  of  line  integrals  defining  the  Radon  transform.  This  difficulty  stems  first 
from  the  fact  that  the  region  over  which  an  X-ray  is  measured  is  not  one-dimensional. 
That  is,  the  region  over  which  the  X-ray  measures  mass  has  both  width  and  length. 
Each  X-ray  measures  the  density  of  the  medium  over  some  strip,  as  in  Figure  4. 
Additionally, the number of data points from which one reconstructs an image is
finite, rather than uncountably infinite, as required for formal transform inversion.
A more accurate perspective from which to view the data obtained by the X-rays is
presented here.

Figure 4. A Single Density Measuring Strip

Begin by fixing an angle $\phi$. We associate with each strip of $\phi$ a label $(\phi, i)$.
We introduce the strip characteristic function, $\gamma$. Define the real valued function $\gamma_{\phi,i}$
on $\Omega$ by the rule

$$\gamma_{\phi,i}(\omega) = \begin{cases} 1, & \text{if } \omega \text{ lies in strip } (\phi, i), \\ 0, & \text{otherwise.} \end{cases}$$

Then an integral defining the sampled Radon transform, for a fixed angle, $\phi$, and a
fixed strip, $(\phi, i)$, becomes

$$f_{\phi,i} = \iint_{\Omega} f(\omega) \, \gamma_{\phi,i}(\omega) \, d\omega.$$

Figure 5. A Single View. Note that strips do not overlap, and cover $\Omega$ completely.

Let us define a view to be the set of all strips for some fixed angle, $\phi$. We impose two
restrictions. First, we require that all strips of a view are non-overlapping. Mathematically, if $(\phi, i)$ and $(\phi, j)$ correspond to two strips of the same view,

$$\gamma_{\phi,i}(\omega) \, \gamma_{\phi,j}(\omega) = 0, \quad \text{for all } i \ne j,$$

for any $\omega \in \Omega$.

Second, we require that the strips composing a view completely cover the
compact set. That is, for any $\omega \in \Omega$, and every angle $\phi$, there exists some strip $(\phi, i)$
such that

$$\gamma_{\phi,i}(\omega) = 1.$$

See Figure 5 for a graphical presentation of these properties.
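
The two restrictions on a view can be illustrated with a short sketch. It assumes, purely for illustration, that the region is a disk centered at the origin, that the strips of a view have equal width, and that each strip owns its lower boundary (with the last strip also owning the far boundary) so that boundary points are not counted twice. The names `strip_index` and `gamma` and the `diameter` parameter are hypothetical.

```python
import math

def strip_index(point, phi, n_strips, diameter=1.0):
    """Index of the strip of view phi containing point, or None if point is outside the disk."""
    x, y = point
    if math.hypot(x, y) > diameter / 2.0:
        return None
    # Signed offset across the strips, shifted so that it runs from 0 to diameter.
    s = x * math.cos(phi) + y * math.sin(phi) + diameter / 2.0
    width = diameter / n_strips
    # Clamp so that the far boundary belongs to the last strip.
    return min(int(s / width), n_strips - 1)

def gamma(phi, i, point, n_strips, diameter=1.0):
    """Strip characteristic function: 1 if point lies in strip (phi, i), else 0."""
    return 1 if strip_index(point, phi, n_strips, diameter) == i else 0

# A few hypothetical points inside a disk of diameter 1.
points = [(0.0, 0.0), (0.3, 0.2), (-0.49, 0.0), (0.1, -0.4)]
coverage = [sum(gamma(0.7, i, p, 4) for i in range(4)) for p in points]
```

Each entry of `coverage` counts the strips of the view containing the corresponding point; a count of exactly one expresses both the non-overlap and the covering requirement.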
Assume that we have some manner in which to control the width of the strips.
Then we may select some number of strips of equal width for each view. Identifying
the number of strips for view $\phi$ as $n_\phi$, and the width of a strip for view $\phi$ as $b_\phi$, we
may conclude that:

$$n_\phi \times b_\phi = 1, \text{ the diameter of } \Omega.$$

For a finite number of views, $N_v$, the application of this convention partitions
the set $\Omega$ into a finite number of polygons. We call the set of these polygons a
polygonal partition of $\Omega$. Figure 6 illustrates the manner in which these polygons are
formed. With each of the resultant polygons, $s_j$, we associate a scalar, area($s_j$), and
a characteristic function,

$$\psi_j(\omega) = \begin{cases} 1, & \text{if } \omega \in s_j, \\ 0, & \text{otherwise.} \end{cases}$$

It is the set of these characteristic functions, $\psi_j$, that we use as the basis for the
function space, $F$.

Theorem 2. For any continuous function, $g$, defined on $\Omega$, and any $\epsilon > 0$,
there exists some polygonal partition on $n$ polygons, and some function

$$f = \sum_{i=1}^{n} \alpha_i \psi_i$$

such that $\|f - g\|_\infty < \epsilon$.

Note that we may write $\|f(\omega) - g(\omega)\|_\infty$ with the equivalent notation,

$$\|f(\omega) - g(\omega)\|_\infty = \max_{j = 1, 2, \ldots, n} \left\{ \max_{\omega \in s_j} \left| \left( f(\omega) - g(\omega) \right) \psi_j \right| \right\}.$$

Figure 6. Polygons Created by the Views {t = 0, 1, ..., 4}, Each with 4 Strips.

One may easily verify that $\|f(\omega) - g(\omega)\|_\infty$ is a norm. Note that we may use the
maximum over $j$ rather than the supremum, as the polygonal partition is a finite set.
The properties of our function space, $F$, allow us to use the maximum rather than
the supremum over each polygon, $s_j$.

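Proof: Let $g$ be any continuous function in $F$, and let $\epsilon > 0$ be given. As $g$ is
continuous on the compact set $\Omega$, there exists some $\delta > 0$ such that

$$\|(x,y) - (p,q)\|_\infty < \delta \ \text{ implies that } \ \|g(x,y) - g(p,q)\|_\infty < \epsilon. \quad \text{(III.3)}$$

We use only two angles, $\phi_1 = 0$ and $\phi_2 = \pi/2$. Let $n_{\phi_1} = n_{\phi_2} = \lceil 1/\delta \rceil$. Note that this
implies that

$$n = n_{\phi_1} \times n_{\phi_2} = \left\lceil \frac{1}{\delta} \right\rceil^2,$$

as in Figure 7. Further, for any two points, $(x,y)$ and $(p,q)$ in a fixed polygon,

$$\|(x,y) - (p,q)\|_\infty < \delta.$$

Figure 7. An Arbitrary Square of the Proof Partition

Let $f = \sum_{j=1}^{n} \alpha_j \psi_j$. We now consider $\|f - g\|_\infty$.

$$\begin{aligned}
\|f - g\|_\infty &= \max_{j = 1, 2, \ldots, n} \left\{ \max_{\omega \in s_j} \left| \left( f(\omega) - g(\omega) \right) \psi_j \right| \right\} \\
&= \max_{j = 1, 2, \ldots, n} \left\{ \max_{\omega \in s_j} \left| \sum_{i=1}^{n} \alpha_i \psi_i \psi_j - g(\omega) \psi_j \right| \right\} \\
&= \max_{j = 1, 2, \ldots, n} \left\{ \max_{\omega \in s_j} \left| \alpha_j \psi_j - g(\omega) \psi_j \right| \right\} \\
&= \max_{j = 1, 2, \ldots, n} \left\{ \max_{\omega \in s_j} \left| \alpha_j - g(\omega) \right| \right\}.
\end{aligned}$$

Since $g$ is continuous and each of our polygons is compact, $g$ achieves its maximum
and minimum on each square. For each square, $s_j$, define $M_j = \max_{\omega \in s_j} g(\omega)$, and
$m_j = \min_{\omega \in s_j} g(\omega)$. Choose $\alpha_j$ such that

$$m_j \le \alpha_j \le M_j.$$

Using the continuity of $g$ to invoke the intermediate value theorem, there exists some
$\hat{\omega}_j \in s_j$ such that $g(\hat{\omega}_j) = \alpha_j$. Further, we know that $\omega \in s_j \Rightarrow \|\omega - \hat{\omega}_j\|_\infty < \delta$. Therefore,
for any square, $s_j$,

$$\max_{\omega \in s_j} \left| g(\hat{\omega}_j) - g(\omega) \right| < \epsilon, \ \text{ implying } \ \left| \alpha_j - g(\omega) \right| < \epsilon, \ \text{ for } j = 1, \ldots, n.$$

Therefore

$$\max_{j = 1, 2, \ldots, n} \left\{ \max_{\omega \in s_j} \left| \left( f(\omega) - g(\omega) \right) \psi_j \right| \right\} < \epsilon.$$

□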


While the above proof uses only two views, one may increase the number of
views, or insist on narrower strips in the partition of $\Omega$. Clearly, such a refinement
cannot degrade the approximation of the function $g$, but only maintain or improve
it. We may, at worst, maintain the same constant values over the new polygons that
they were assigned over the coarser partition.

We now demonstrate the utility of defining a basis for $F$. Let $k = \sum_{i=1}^{N_v} n_{\phi_i}$.
That is, let $k$ denote the total number of strips defining the sample transform. For any
polygonal partition $P$ on $n$ polygons, the sample Radon transform with respect to $P$,
which we denote $f_P$, may be written as

$$f_P = A^T y$$

where $A$ is an $n \times k$ matrix, and $y \in \mathbb{R}^n$. The matrix $A$ is given by:

$$A_{ij} = \begin{cases} 0, & \text{if } \gamma_j(\omega) = 0 \text{ for all } \omega \in s_i, \\ \text{area}(s_i), & \text{otherwise.} \end{cases}$$

That is, $A_{ij}$ represents the area of the $i$th polygon if the polygon falls within strip
$j$. The $i$th component of $y$ is the mean density of the function $f$ over polygon $i$.

For any fixed polygonal partition, the feasible set is a subset of the infinite
dimensional vector space, $F$. Each element of the subset may be thought of as a
vector in $\mathbb{R}^n$. Without further restriction, the feasible set becomes the set of all
vectors, $y \in \mathbb{R}^n$ such that $A^T y = f_P$. We exploit many of the subsequent theorems as
a result of the ability to translate the problem into $\mathbb{R}^n$.
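
As a rough illustration of the relation between the polygon densities and the sampled transform, the sketch below builds the matrix for a deliberately simplified setting and applies its transpose to a density vector. The simplifications are assumptions made here, not the thesis's construction: the region is taken to be the unit square rather than a disk, there are exactly two perpendicular views with `m` strips each, and the polygons are therefore `m * m` equal squares. The names `build_A` and `sample_transform` are hypothetical.

```python
def build_A(m):
    """n x k matrix for an m-by-m grid of squares on the unit square,
    with two perpendicular views of m strips each (n = m*m, k = 2*m)."""
    n, k = m * m, 2 * m
    area = 1.0 / (m * m)
    A = [[0.0] * k for _ in range(n)]
    for r in range(m):
        for c in range(m):
            i = r * m + c            # polygon index
            A[i][c] = area           # polygon i lies in strip c of the first view
            A[i][m + r] = area       # polygon i lies in strip r of the second view
    return A

def sample_transform(A, y):
    """Compute the transpose of A applied to y, one entry per strip."""
    n, k = len(A), len(A[0])
    return [sum(A[i][j] * y[i] for i in range(n)) for j in range(k)]

A = build_A(2)
y = [1.0, 2.0, 3.0, 4.0]             # hypothetical mean densities on the four squares
f_P = sample_transform(A, y)         # one value per strip
```

Each entry of `f_P` is the area-weighted sum of the densities of the polygons inside one strip, which is the discrete counterpart of integrating the function over that strip.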


IV 


CONVEXITY 


A.  OVERVIEW 

In  this  chapter,  we  investigate  the  concept  of  convexity,  both  as  it  pertains 
to  sets  and  to  functions.  The  primary  motivation  for  this  investigation  comes  from 
the  fact  that  we  may,  when  certain  convexity  conditions  are  met,  conclude  that  local 
maxima  and  minima  are  global.  Stated  differently,  we  may  eliminate  a  portion,  often 
a  large  portion,  of  our  feasible  set  from  consideration  when  attempting  to  find  the 
optimal  value  of  our  objective  function.  This  chapter  lays  the  groundwork  for  our 
investigation  into  duality,  contained  in  the  following  chapter. 

This  and  the  following  chapter  form  the  foundation  for  linear  optimization, 
and,  consequently,  the  concepts  and  results  herein  may  be  found  in  most  elementary 
texts  on  the  subject.  The  material  in  this  chapter  is  taken  primarily  from  [Ref.  8] 
and  [Ref.  9],  to  which  the  reader  is  referred  for  further  study. 

B.  CONVEX  SETS 

Let us return briefly to the image reconstruction problem. Consider two arbitrary functions, $f, g \in F$, the space of bounded, piecewise continuous functions on
the compact set, $\Omega$. Select some arbitrary value for a parameter, $\lambda$. We require that
$\lambda \in [0, 1]$. Consider the function,

$$h(\omega) = \lambda f(\omega) + (1 - \lambda) g(\omega).$$

First note that as both $f$ and $g$ are defined on $\Omega$, so is $h$. As $f$ and $g$ are bounded on
a compact set, $M = \max \{ \sup_\Omega f(\omega), \sup_\Omega g(\omega) \}$ is well defined. We know that

$$h(\omega) \le \lambda M + (1 - \lambda) M, \ \text{ implying that } \ h(\omega) \le M, \ \text{ for all } \omega \in \Omega.$$

Consequently, the function, $h$, is in $F$. The important items to note here are that $f$, $g$,
and $\lambda \in [0, 1]$ were each chosen arbitrarily. We conclude, then, for any two elements
$f, g \in F$ and for any $\lambda \in [0, 1]$ the function,

$$h = \lambda f + (1 - \lambda) g \in F.$$

The above example proves that the set $F$ is a convex set. A set $C \subset L$, a linear
vector space, is called convex if for any two elements $y, z \in C$ and $\lambda \in [0, 1]$,

$$x = \lambda y + (1 - \lambda) z \in C.$$

Any element $y \in C$ of the form $y = \sum_{i=1}^{n} \lambda_i y_i$, with $\sum_{i=1}^{n} \lambda_i = 1$, $0 \le \lambda_i \le 1$, is
called a convex combination of $y_1, y_2, \ldots, y_n$. This convex combination is called strict
if $\lambda_i \in (0, 1)$ for all $i$. That is, the convex combination is strict if $\lambda_i \ne 0$ or $1$, for all $i$.
We now examine a fundamental characterization of convex sets.

Theorem 3. [Ref. 8] Let $C$ be a convex subset of $L$, an $n$-dimensional linear
vector space. Every convex combination of the vectors of $C$ is an element of
$C$.

Proof: For $n = 1$, the claim is trivial. Assume that the statement is true for $r \le n - 1$
where $n > 1$. Now we consider some convex combination

$$y = \sum_{i=1}^{n} \lambda_i y_i, \ \text{ where } y_i \in C, \ \sum_{i=1}^{n} \lambda_i = 1, \ \lambda_i \ge 0.$$

If $\lambda_n = 1$, then we are done, so we suppose that $\lambda_n \ne 1$. Define

$$\lambda = \sum_{i=1}^{n-1} \lambda_i, \qquad \hat{\lambda}_i = \frac{\lambda_i}{\lambda}, \quad i = 1, \ldots, n-1.$$

Then

$$y = \lambda \sum_{i=1}^{n-1} \hat{\lambda}_i y_i + \lambda_n y_n.$$

Note that the sum in the first term satisfies the conditions of the inductive hypothesis.
That is,

$$\sum_{i=1}^{n-1} \hat{\lambda}_i = 1, \ \text{ and } \ \hat{\lambda}_i \ge 0.$$

We conclude that

$$\hat{y} = \sum_{i=1}^{n-1} \hat{\lambda}_i y_i \in C, \ \text{ and } \ y = \lambda \hat{y} + \lambda_n y_n.$$

Now consider the expression:

$$\lambda + \lambda_n = \sum_{i=1}^{n-1} \lambda_i + \lambda_n = \sum_{i=1}^{n} \lambda_i = 1.$$

Then by the definition of a convex set, $y \in C$.

□


Let $A \in \mathbb{R}^{n \times m}$, and let $b \in \mathbb{R}^m$. Then it is elementary that the sets

$G_1 = \{x : A^T x = b\}$, and
$G_2 = \{x : A^T x \ge b\}$,

are convex. We prove the case of $G_1$.

Proof: Let $x_1, x_2 \in G_1$. Then $x_1, x_2 \in \mathbb{R}^n$, and $A^T x_1 = A^T x_2 = b$. We select some
value for $\lambda \in [0, 1]$, and consider:

$$\begin{aligned}
A^T(\lambda x_1 + (1 - \lambda) x_2) &= \lambda A^T x_1 + (1 - \lambda) A^T x_2 \\
&= \lambda b + (1 - \lambda) b \\
&= b. \quad \text{(IV.1)}
\end{aligned}$$

□

One may show $G_2$ is convex with an identical argument. Note that the set, $G_2$,
defines the feasible set of the linear program.
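
The convexity of the inequality-constrained set can also be observed numerically. The sketch below is an illustration under stated assumptions: the three half-planes, the two feasible points, and the names `in_G2` and `convex_combination` are hypothetical and chosen only for the demonstration. A numerical check on sample points is of course no substitute for the proof.

```python
def dot(u, v):
    """Inner product of two vectors given as sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def in_G2(columns, b, x, tol=1e-12):
    """True if <a_i, x> >= b_i for every column a_i of A (within a small tolerance)."""
    return all(dot(a_i, x) >= b_i - tol for a_i, b_i in zip(columns, b))

def convex_combination(lam, x1, x2):
    """Return lam * x1 + (1 - lam) * x2."""
    return [lam * u + (1.0 - lam) * v for u, v in zip(x1, x2)]

# Hypothetical data: the half-planes x1 >= 0, x2 >= 0 and -x1 - x2 >= -1.
columns = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
b = [0.0, 0.0, -1.0]

x1, x2 = [0.2, 0.7], [0.9, 0.05]     # two feasible points
segment_ok = all(
    in_G2(columns, b, convex_combination(k / 10.0, x1, x2)) for k in range(11)
)
```

Every sampled point on the segment between the two feasible points remains feasible, as the argument above predicts.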

C. HYPERPLANES, POLYHEDRAL SETS, AND EXTREMA

A hyperplane $H$ in $\mathbb{R}^n$ is a set of the form $\{y : \langle p, y \rangle = k\}$ where $p$ is a
nonzero vector in $\mathbb{R}^n$, and $k$ is a given scalar. It is easily shown that the hyperplane,
$H$, is a convex set. A hyperplane divides $\mathbb{R}^n$ into two (non-disjoint) regions, called
half-spaces; one is defined by $\{y : \langle p, y \rangle \ge k\}$ and the other by $\{y : \langle p, y \rangle \le k\}$,
both of which are again convex. Note that the intersection of a finite number, $m$,
of half-spaces, called a polyhedral set, is also convex, since the intersection may be
interpreted as $\{y : A^T y \ge b\}$ where the $i$th half-space is defined as the set

$$\{y : \langle a_i, y \rangle \ge b_i\}.$$

That is, $A$ is an $n \times m$ matrix whose columns are the vectors defining the half-spaces.
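To illustrate this point, we consider a simple example. Define the vectors

$$a_1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad a_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \quad \text{and} \quad a_3 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$

We use the above vectors to define the three half-planes in $\mathbb{R}^2$,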

<ai,y)  >  -2,  <a2,y)  >  and  <113, y)  >  -i 
Using  the  above  convention  for  identifying  the  matrix  A,  and  the  vector  b,  we  find 


that 


0  1  1 

1  -1  0 


and  b  = 


L  4  J 

We may identify the intersection of the half-planes as the set of vectors, $y$ in $\mathbb{R}^2$
satisfying the equation,


■ 

• 

0 

-1 

1 

2 

At y  >  b  or, 

1 

-1 

y\ 

> 

1 

4 

V2 

-1 

0 

1 

4 
Figure 8. The Polyhedral Set of the Example

The  intersection  of  these  half-planes  is  illustrated  in  Figure  8. 

We are interested in simplifying our optimization problem by eliminating portions of the feasible set from consideration. A critical tool in this reduction results
from the notion of an extreme point. We here define an extreme point, and use the
concept to further characterize the convex sets with which we are working in the
example problems.

Let $C$ be a convex set. We call $y \in C$ an extreme point of the set $C$ if it cannot
be represented as a strict convex combination of distinct elements of $C$. Alternately,
the point $y$ is an extreme point if, and only if, for any $\lambda \in (0, 1)$, and for any $x, z \in C$,

$$y = \lambda x + (1 - \lambda) z \ \text{ implies that } \ y = x = z.$$

Figure 9. Extrema of the Example Feasible Set

Geometrically, a point $y$ in a polyhedral set $C$ is an extreme point if it lies on some $n$
linearly independent defining hyperplanes of $C$, where $n$ is the rank of matrix $A^T$, as
formed above. Two extreme points are adjacent if the line segment joining them is
an edge of $C$. That is, the line segment joining them is formed by the intersection of
some $n - 1$ linearly independent defining hyperplanes of $C$. See Figure 9.
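
The geometric characterization suggests a direct, if inefficient, way to list the extreme points of a polyhedral set in the plane: intersect every pair of linearly independent boundary lines and keep the intersections that satisfy all of the constraints. The sketch below does this with exact rational arithmetic. The function name `extreme_points_2d` and the triangle used as data are assumptions for illustration, and the input coefficients are assumed to be integers or `Fraction` values.

```python
from fractions import Fraction
from itertools import combinations

def extreme_points_2d(columns, b):
    """Extreme points of {y in R^2 : <a_i, y> >= b_i}, found by intersecting
    pairs of boundary lines and discarding infeasible intersections."""
    points = set()
    for (a1, b1), (a2, b2) in combinations(list(zip(columns, b)), 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if det == 0:
            continue                               # boundary lines are not independent
        y = (Fraction(b1 * a2[1] - a1[1] * b2, det),   # Cramer's rule
             Fraction(a1[0] * b2 - b1 * a2[0], det))
        if all(a[0] * y[0] + a[1] * y[1] >= b_i for a, b_i in zip(columns, b)):
            points.add(y)                          # feasible intersection
    return points

# Hypothetical triangle: y1 >= 0, y2 >= 0, -y1 - y2 >= -1.
columns = [(1, 0), (0, 1), (-1, -1)]
b = [0, 0, -1]
vertices = extreme_points_2d(columns, b)
```

For this triangle the three pairs of boundary lines yield the three corners. The number of pairs grows quickly with the number of constraints, so the enumeration is meant only to make the definition tangible.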




Theorem 4. [Ref. 7] Let $C$ be a polyhedral subset of $\mathbb{R}^n$. If $C$ is bounded, then
$C$ has at least $n + 1$ linearly independent defining hyperplanes.

This theorem is offered without proof. However, its validity for the case of
$n = 2$ is illustrated in Figure 9, where the polyhedral set in $\mathbb{R}^2$ has three independent
defining hyperplanes. An immediate consequence of the above is the following:

Theorem 5. Let $C$ be an arbitrary bounded convex subset of $\mathbb{R}^n$. $C$ has at
least $n$ extreme points.

Proof: Suppose that there are fewer than $n$ extreme points of $C$. Since any $n$ linearly
independent hyperplanes must intersect in a single point in $\mathbb{R}^n$, there are fewer
than $n + 1$ linearly independent hyperplanes, and $C$ is unbounded. □

D.  CONVEX  FUNCTIONS 

We now introduce convex functions, and the primary characteristic in
which we are interested. This introduction is cursory in nature. For a more detailed exploration of convexity with respect to functions, the reader is referred to
[Ref. 8] and [Ref. 10].

Let $C \subset \mathbb{R}^n$ be a convex set. A function $f$, defined on $C$, is said to be convex
if for any elements $x, y \in C$, and $\lambda \in [0, 1]$:

$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y).$$

If $f$ is convex, then $-f$ is said to be concave. Linear functions are, thus, both convex
and concave. Having alluded to the utility of convex functions, we state an important
result formally.

Theorem 6. [Ref. 8] Let $f$ be a convex function defined on a closed convex
set, $C \subset \mathbb{R}^n$. Then a relative minimum of $f$ over $C$ is a global minimum.

Proof: Let $f$ have a local minimum at $y_1$, and a global minimum at $y_2$, with $f(y_1) > f(y_2)$. Let $\lambda \in (0, 1)$ be given. Because $f$ is convex

$$f(\lambda y_2 + (1 - \lambda) y_1) \le \lambda f(y_2) + (1 - \lambda) f(y_1). \quad \text{(IV.2)}$$

Also, since it is assumed that $f(y_1) > f(y_2)$, we conclude

$$\lambda f(y_2) + (1 - \lambda) f(y_1) < \lambda f(y_1) + (1 - \lambda) f(y_1) = f(y_1). \quad \text{(IV.3)}$$

We now define $N_\epsilon(y_1) = \{y \in \mathbb{R}^n : \|y - y_1\| < \epsilon\}$. That is, we define an $\epsilon$ neighborhood
about the point, $y_1$. If

$$0 < \lambda < \frac{\epsilon}{\|y_1 - y_2\|}, \ \text{ and } \ y = \lambda y_2 + (1 - \lambda) y_1,$$

then $y \in N_\epsilon(y_1)$. Then

$$\begin{aligned}
f(y) &= f(\lambda y_2 + (1 - \lambda) y_1) \\
&\le \lambda f(y_2) + (1 - \lambda) f(y_1) \\
&< \lambda f(y_1) + (1 - \lambda) f(y_1) \\
&= f(y_1),
\end{aligned}$$

by (IV.2) and (IV.3), contradicting the fact that $f$ has a local minimum at $y_1$. We have shown,
then, that only absolute minima are possible. □
If the objective function is convex (which it must be since we are considering
only  linear  objective  functions),  we  can  be  sure  that  we  have  found  an  optimal  vector 
if  it  is  locally  optimal.  This  fact  forms  the  basis  for  the  Simplex  algorithm,  which  we 
explore  in  the  following  chapter. 

Theorem 7. If an optimal solution to the Linear Program exists, that is, if
$\min \{f(y)\}$ exists and is finite for some $y$ in the feasible set, $C \subset \mathbb{R}^n$, then
there is an optimal extreme point.

Proof: Let $y \in C$ be an optimal vector, but not an extreme point. Let a linear
objective function $f$ defined on the polyhedral set $C$ be given. Since $|f| < \infty$ at
an optimal vector, one may clearly add a sufficient number of hyperplanes to bound
the feasible set if it is not already bounded, without changing the optimal solution.
Assume that $f$ is optimal at $y$. We consider two cases.

Case 1: The vector $y$ does not lie on an edge of $C$.

We first recognize that $y$ can be written as a convex combination of the extreme
points of $C$, since there are at least $n$ linearly independent extreme points. Let
$E = \{e : e \text{ is an extreme point of } C\}$. Let $E$ have cardinality $r$. Then we may write
$y = \sum_{i=1}^{r} \lambda_i e_i$. The linearity of the objective function, $f$, implies $f(y) = \sum_{i=1}^{r} \lambda_i f(e_i)$.
Let $e_j$ be some extreme point such that $f(e_j) > 0$, and let us decrease the value of $\lambda_j$
by $\delta > 0$ units. We may do so without leaving the feasible set since we are not on an
edge. Note that if no such extreme point $e_j$ exists, then we may increase the value of
$\lambda_j$, and the argument still holds. Call the new element of the feasible set $y'$. Then

$$\begin{aligned}
f(y') &= f\left( \sum_{i \ne j} \lambda_i e_i + (\lambda_j - \delta) e_j \right) \\
&= f(y - \delta e_j) \\
&= f(y) - \delta f(e_j) \\
&< f(y),
\end{aligned}$$

implying that $y$ is not the optimal vector, a contradiction. Hence if $y$ is a non-extreme
optimal vector, it must lie on an edge of $C$.

Case 2: The vector $y$ lies on an edge of $C$.

Since $y$ is not an extreme point, but is on an edge, it is on the line segment
joining two extreme points, $e_1$ and $e_2$ of $C$, and may be written as $y = \lambda e_1 + (1 - \lambda) e_2$,
for some $\lambda \in (0, 1)$. Parameterize the line segment between the points $e_1, e_2$ by the
equation $y(t) = t e_1 + (1 - t) e_2$, as $t : 0 \to 1$. Fix some $t \in [0, 1]$, and let $y' = (1 - t) e_1 + t e_2$. Then

$$\begin{aligned}
f(y') - f(y) &= (1 - t) f(e_1) + t f(e_2) - \lambda f(e_1) - (1 - \lambda) f(e_2) \\
&= -(t - (1 - \lambda)) f(e_1) + (t - (1 - \lambda)) f(e_2) \\
&= (t - (1 - \lambda)) \left( -f(e_1) + f(e_2) \right) \\
&\ge 0, \ \text{ for all } t, \text{ since } y \text{ is the optimal vector.}
\end{aligned}$$

Since $y$ is not an extreme point, it can be represented as a strict convex combination of $e_1$ and $e_2$. Therefore, we may choose some $t \in (0, 1)$, such that
$t - (1 - \lambda) > 0$, implying
$-f(e_1) + f(e_2) \ge 0$. Therefore,

$$f(e_2) \ge f(e_1).$$

An identical argument yields the result that $f(e_2) \le f(e_1)$. We conclude that $f(e_1) = f(e_2)$, and that any $t \in [0, 1]$ results in an optimal vector. Choosing $t = 0$, or $t = 1$
places us at an extreme point. □

An alternate proof may be found in [Ref. 7].
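
A practical reading of Theorem 7 is that, for a bounded polyhedral feasible set, it suffices to compare the objective at the extreme points. The following sketch illustrates that reading only; the function name `best_extreme_point`, the triangle, and the objective vector are assumptions introduced here, and the extreme points are supplied directly rather than computed.

```python
def best_extreme_point(c, extreme_points):
    """Return an extreme point minimizing the linear objective <c, y>, and its value."""
    def objective(y):
        return sum(c_i * y_i for c_i, y_i in zip(c, y))
    best = min(extreme_points, key=objective)
    return best, objective(best)

# Hypothetical bounded feasible set: the triangle with these extreme points.
vertices = [(0, 0), (1, 0), (0, 1)]
c = (1, -2)
point, value = best_extreme_point(c, vertices)

# A feasible point that is not an extreme point, for comparison.
interior = (0.25, 0.5)
interior_value = sum(c_i * y_i for c_i, y_i in zip(c, interior))
```

Since every point of the triangle is a convex combination of its corners and the objective is linear, no feasible point can score below `value`; the interior point above is one instance. The Simplex algorithm avoids listing all extreme points, which this brute-force comparison does not.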

E.  AN  ASIDE:  THE  CONVEX  HULL 

We desire to work with convex subsets of linear vector spaces, as they have
useful characteristics when we attempt to solve more general optimization problems.
However, there is no guarantee that an arbitrary set is convex. For such cases, we
define the convex hull of an arbitrary set $A \subset L$, denoted Conv($A$), as the set of all
possible convex combinations of the elements of $A$, where $L$ is a linear vector space.
An example of a convex hull of a non-convex set in $\mathbb{R}^2$ is displayed in Figure 10.

Figure 10. Forming the Convex Hull

Clearly, if $A$ is convex, then Conv($A$) $= A$. The intuitive notion that the convex
hull of a set, $A \subset L$, is the smallest convex subset of $L$ in which $A$ is contained, and
conversely, are easily proven theorems (See [Ref. 8]).

The real utility of the convex hull stems from the fact that any element of
Conv($A$) may be written as a convex combination of the elements of $A$. Generating
the convex hull does not add any new extreme points. This is offered without proof.
The interested reader should consult [Ref. 8]. Consequently, if we are solving an
optimization problem with a linear objective function on a non-convex set, $A$, then
solving it over the convex hull of the set $A$, rather than over the set $A$ itself, does
not change the solution of the problem.
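
For a finite set of points in the plane, the extreme points of the convex hull can be computed directly. The sketch below uses Andrew's monotone chain method, a standard technique that is not taken from the thesis; the L-shaped point set and the function names `cross` and `convex_hull` are illustrative assumptions.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b; positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Extreme points of the convex hull, in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:                       # build the lower chain left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):             # build the upper chain right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]      # drop the duplicated chain endpoints

# Hypothetical non-convex set: the corners of an L-shaped region.
A = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
hull = convex_hull(A)
```

The hull's extreme points are all drawn from the original set, and the re-entrant corner `(1, 1)` is dropped, consistent with the remark that forming the convex hull adds no new extreme points.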
V


DUALITY  AND  THE  SIMPLEX 
ALGORITHM 


A.  OVERVIEW 

The concept of duality makes it possible for us to bound the optimal value for
the objective function, $f$, and in many cases, to solve the LOP more efficiently. As
before, let $c$ be a vector in $\mathbb{R}^n$. Let $S$ be an arbitrary index set. We have previously
stated that for every $s \in S$, we associate a vector $a(s)$ in $\mathbb{R}^n$, and a scalar $b(s)$. The
general form of the linear optimization problem is:

minimize: $\langle c, y \rangle$

subject to: $\langle a(s), y \rangle \ge b(s)$, for all $s \in S$

over all $y \in \mathbb{R}^n$ (V.1)

We know that we achieve an upper bound for the optimal value of the preference function as soon as we find an element of the feasible set. However, we have
no such simple criterion for determining a lower bound. Intuitively the prospect of
finding some feasible vector is less daunting than solving the problem. Using duality
allows us to form an associated optimization problem, find a feasible vector in the
associated problem, and use the feasible vector to derive a lower bound of the original problem. The associated optimization problem is called the Dual. In some cases,
we may bound the original optimization from below arbitrarily well using the dual
problem. We refer to the original linear optimization problem as the primal, P. The
primal, P, and its associated dual, D, are referred to as a Dual Pair.

Define the value of a LOP to be the optimal objective function value. We seek
properties that allow us to approximate the solution of a linear optimization problem
arbitrarily well, and to determine when the optimal value of the linear optimization
problem and its corresponding dual are the same.

This  chapter,  in  conjunction  with  the  previous  chapter,  forms  the  fundamental 
principle  underlying  the  Simplex  algorithm.  The  reader  is  again  referred  to  [Ref.  7] 
and  [Ref.  9]  for  more  detailed  descriptions  of  the  material  of  this  chapter. 

B.  WEAK  DUALITY 

We begin with the generic linear optimization problem, (V.5). The first theorem that allows us to bound the problem from below is stated here. Note that we
allow for an infinite index set $S$.

Theorem 8. The Duality Lemma [Ref. 9] Let the finite subset

$$\{s_1, s_2, \ldots, s_q\} \subset S,$$

and the non-negative vector

$$x = [x_1, x_2, \ldots, x_q]^T$$

be such that:

$$c = a(s_1) x_1 + a(s_2) x_2 + \cdots + a(s_q) x_q.$$

Then for any feasible vector $y = [y_1, y_2, \ldots, y_n]^T$ in the feasible set of the
optimization problem, P,

$$b(s_1) x_1 + b(s_2) x_2 + \cdots + b(s_q) x_q \le c^T y.$$

Proof: Since $y$ is a feasible vector,

$$\langle a(s_i), y \rangle = a(s_i)^T y \ge b(s_i), \ \text{ for } i = 1, 2, \ldots, q.$$

Further, since $x_i \ge 0$ by assumption,

$$\sum_{i=1}^{q} b(s_i) x_i \le \sum_{i=1}^{q} \left( a(s_i)^T y \right) x_i,$$

implying,

$$\sum_{i=1}^{q} b(s_i) x_i \le \left( \sum_{i=1}^{q} x_i a(s_i) \right)^T y = c^T y.$$

□

As an example, consider the problem of finding the monic second order polynomial, $p_2$, orthogonal to both $p_0 = 1$ and $p_1 = x - \frac{1}{2}$. Recalling from Equation
(III.2), the primal of this problem is

minimize: $\frac{5}{12} + \frac{7}{12}\alpha_1 + \alpha_0$

subject to: $\frac{1}{3} + \frac{1}{2}\alpha_1 + \alpha_0 \ge 0,$

$\frac{1}{12} + \frac{1}{12}\alpha_1 \ge 0.$ (V.2)

Disregarding the constant in the objective function does not affect the choice of an
optimal vector. Consequently, the optimal vector for (V.2) and the LOP

minimize: $\langle c, y \rangle$

subject to: $\langle a(s_1), y \rangle \ge b(s_1)$

$\langle a(s_2), y \rangle \ge b(s_2)$ (V.3)

where

$$c = \begin{bmatrix} \tfrac{7}{12} \\ 1 \end{bmatrix}, \quad a(s_1) = \begin{bmatrix} \tfrac{1}{2} \\ 1 \end{bmatrix}, \quad a(s_2) = \begin{bmatrix} \tfrac{1}{12} \\ 0 \end{bmatrix},$$

and

$$b(s_1) = -\frac{1}{3}, \quad b(s_2) = -\frac{1}{12},$$

are the same.

Attempting to satisfy the hypothesis of the duality lemma, we seek a non-negative linear combination of $a(s_1)$ and $a(s_2)$ that sums to $c$. That is, we seek a
non-negative solution to the equation:

$$\begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{12} \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \tfrac{7}{12} \\ 1 \end{bmatrix}.$$

Clearly, the only such vector satisfying the equation is the vector $x = [1, 1]^T$. Consequently, the optimal value of the primal problem of (V.3) can be no better than

$$x_1 b(s_1) + x_2 b(s_2) = \left( -\frac{1}{3} \right) + \left( -\frac{1}{12} \right) = -\frac{5}{12}.$$

Because the optimal vector of (V.2) must be the same as that of (V.3), the value of
(V.2) is bounded below by

$$\frac{5}{12} - \frac{5}{12} = 0,$$

as expected.
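
The arithmetic of this example is easy to reproduce exactly. The sketch below solves the two-by-two system for the multipliers by Cramer's rule, forms the lower bound of the duality lemma, and compares it with the objective at one feasible vector. The choice of that vector, $y = [-1, \frac{1}{6}]^T$, is an addition made here for illustration (it corresponds to the polynomial $x^2 - x + \frac{1}{6}$); the names `primal_value` and `is_feasible` are likewise hypothetical.

```python
from fractions import Fraction as F

c = (F(7, 12), F(1))
a = [(F(1, 2), F(1)), (F(1, 12), F(0))]     # a(s1), a(s2)
b = [F(-1, 3), F(-1, 12)]                   # b(s1), b(s2)

# Solve x1 * a(s1) + x2 * a(s2) = c by Cramer's rule.
det = a[0][0] * a[1][1] - a[1][0] * a[0][1]
x1 = (c[0] * a[1][1] - a[1][0] * c[1]) / det
x2 = (a[0][0] * c[1] - c[0] * a[0][1]) / det

# Lower bound supplied by the duality lemma (valid because x1, x2 >= 0).
lower_bound = x1 * b[0] + x2 * b[1]

def primal_value(y):
    """Objective <c, y> of the LOP without the constant term."""
    return c[0] * y[0] + c[1] * y[1]

def is_feasible(y):
    """True if <a(s_i), y> >= b(s_i) for both constraints."""
    return all(a_i[0] * y[0] + a_i[1] * y[1] >= b_i for a_i, b_i in zip(a, b))

y = (F(-1), F(1, 6))                        # y = [alpha1, alpha0], an illustrative feasible vector
```

For this particular vector the objective equals the lower bound, so the bound is attained and no feasible vector can do better.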


C.  THE  DUAL 

Having  stated  the  duality  lemma,  we  move  to  a  formal  definition  of  the  dual, 
and  similarly,  the  dual  pair.  We  begin  with  the  special  case  of  a  Linear  Program. 
The dual of a linear program

minimize: $\langle c, y \rangle$

subject to: $A^T y \ge b$

is defined to be

maximize: $\langle b, x \rangle$

subject to: $A x = c$

with $x_i \ge 0$, for $i = 1, 2, \ldots, q$. (V.4)

Note that the dual of an LP is an LP itself. To be feasible above, we require a
non-negative linear combination of the constraint vectors to sum to the vector $c$. The
vector $b$ becomes the objective vector in the dual. These facts highlight the difficulty
of defining the dual of an infinite LOP. Because of the difficulty of computing infinite
sums (possibly uncountably infinite), we require a variation of the dual for the infinite
case.

Recall the generic LOP

minimize: $\langle c, y \rangle$

subject to: $\langle a(s), y \rangle \ge b(s)$, for all $s \in S$

over all $y \in \mathbb{R}^n$. (V.5)

As it proves useful in the statement of the dual, we write (V.5) in the alternate form

minimize: $\sum_{r=1}^{n} c_r y_r$

subject to: $\sum_{r=1}^{n} a_r(s) y_r \ge b(s)$

for all $s \in S$. (V.6)

The dual optimization problem, D, is defined to be:

Find a finite subset $\{s_1, s_2, \ldots, s_q\} \subset S$, and the non-negative numbers, $x_1, x_2, \ldots, x_q$,
such that the expression:

$$\sum_{i=1}^{q} b(s_i) x_i$$

is maximized, subject to the constraints

$$\sum_{i=1}^{q} x_i a_r(s_i) = c_r,$$

for $r = 1, 2, \ldots, n$. (V.7)

That is, the dual of the infinite dimensional LOP is to find some optimal
finite subset of the index set, and then solve the resulting LP dual. In keeping with
convention, we call the process of taking a finite subset of an infinite set discretizing. It
is important to note that the dual is, in general, a non-linear problem in $2q$ variables,
since both the discretization and values for the coefficients, $x_i$, are unknown. However,
once we have chosen a subset of $S$, the problem is linear in the unknowns $x_i$. Further,
one might suspect that if a sequence of discretizations of $S$ is chosen systematically,
then we may be able to arrive at an acceptable approximation of the solution of
the associated primal problem, assuming one exists. That is, we may get arbitrarily
close to the solution of the dual problem, and consequently, find an arbitrarily good
approximation of the solution to the infinite dimensional primal optimization problem
by solving a sequence of Linear Programs. This is a basic premise behind solving
infinite dimensional linear programs with the Simplex algorithm.
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D.  APPROXIMATING  THE  EXPONENTIAL  FUNCTION 

The problem of approximating the exponential function with an nth degree polynomial is now analyzed more closely. Of particular interest is how duality results enable us to determine the relative quality of a given approximation, and how they allow us to bound the error in the problem.

1.  The  Primal  Problem 

Recall that we stated the problem of approximating the exponential function over the interval $[0, 3]$ as

Determine the polynomial

\[
f(t) = \sum_{i=0}^{n} a_i t^i
\]

that minimizes the expression

\[
\sup_{t \in [0,3]} \left| f(t) - e^t \right| .
\]

Let us formulate this problem in terms of the standard linear optimization problem. We relabel the index set $T$ in place of $S$ and define it to be the interval $[0, 3]$. Realizing that the objective function above is a scalar valued function, as a first step we reformulate the problem as

\[
\begin{array}{ll}
\text{minimize:} & a_{n+1} \\
\text{subject to:} & \left| \sum_{i=0}^{n} a_i t^i - e^t \right| \le a_{n+1}, \quad \text{for all } t \in T.
\end{array}
\]

Eliminating the absolute values, we replace each constraint with the equivalent pair of constraints,

\[
-\sum_{i=0}^{n} a_i t^i + e^t \ge -a_{n+1} \quad \text{and} \quad \sum_{i=0}^{n} a_i t^i - e^t \ge -a_{n+1}.
\]

Rewriting, we arrive at

\[
-\sum_{i=0}^{n} a_i t^i + a_{n+1} \ge -e^t \quad \text{and} \quad \sum_{i=0}^{n} a_i t^i + a_{n+1} \ge e^t.
\]

Thus, each element of the index set $T$ has two associated constraint vectors. Let $y = [a_0, a_1, \ldots, a_{n+1}]^T$. We have, for each $t \in T$, a vector

\[
a(t) = [t^0, t^1, \ldots, t^n, 1]^T,
\]

for which the second constraint reads $\langle a(t), y \rangle \ge e^t$, while the vector of the first constraint is obtained from $a(t)$ by negating its first $n+1$ components.

It proves useful in the formulation of the dual problem to distinguish the two constraints associated with each $t \in T$. As a notational device we identify the vectors

\[
a(t^+) = [t^0, t^1, \ldots, t^n, 1]^T \quad \text{and} \quad a(t^-) = [-t^0, -t^1, \ldots, -t^n, 1]^T,
\]

so that the two constraints read $\langle a(t^-), y \rangle \ge -e^t$ and $\langle a(t^+), y \rangle \ge e^t$.

It is important to note that the functional notation for the vectors $a(t^+)$ and $a(t^-)$ is used for convenience only. No such functional relationship exists, as there are two constraint vectors for each $t \in [0, 3]$. We distinguish the vectors by labeling two sets, $T^+$ and $T^-$. Note that $T^+ = T^- = [0, 3]$.

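Similarly, for each $t \in T$, we have the scalars $b(t^+) = e^t$ and $b(t^-) = -e^t$. We finally identify the objective function vector, $c = [0, 0, \ldots, 1]^T \in \mathcal{R}^{n+2}$. The final formulation of the primal problem, P, is

\[
\begin{array}{ll}
\text{minimize:} & c^T y \\
\text{subject to:} & a(t^+)^T y \ge b(t^+) \\
 & a(t^-)^T y \ge b(t^-) \quad \text{for all } t \in T \\
 & \text{over all } y \in \mathcal{R}^{n+2}.
\end{array}
\tag{V.8}
\]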

2.  The  Dual  Problem 

Having put the primal in the desired form, we turn our attention to the dual. Referring to the general form of the dual as in (V.7), we seek the finite subset $\mathcal{T} = \{t_1, t_2, \ldots, t_q\} \subset T$, and the vector $x \in \mathcal{R}^q$, that maximizes the expression

\[
\sum_{i=1}^{q} x_i\, b(t_i)
\]

while satisfying the constraints

\[
\sum_{i=1}^{q} x_i\, a_r(t_i) = c_r, \quad \text{for } r = 1, \ldots, n.
\]

First make the substitutions $b(t_i) = e^{t_i}$ and $a_r(t_i) = t_i^r$. As we have defined the set $T$ to be $T^+ \cup T^-$, the above formulation is equivalent to the following.


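Find the subsets

\[
\tau^+ = \{t_1^+, \ldots, t_{q^+}^+\} \subset T^+
\]

and

\[
\tau^- = \{t_1^-, \ldots, t_{q^-}^-\} \subset T^-
\]

and non-negative scalars $x_1^+, x_2^+, \ldots, x_{q^+}^+$ and $x_1^-, x_2^-, \ldots, x_{q^-}^-$ with which to associate each element of the respective sets, that maximize

\[
\sum_{i=1}^{q^+} x_i^+ e^{t_i^+} - \sum_{i=1}^{q^-} x_i^- e^{t_i^-}
\]

and satisfy the constraints

\[
\begin{aligned}
\sum_{i=1}^{q^+} x_i^+ (t_i^+)^r - \sum_{i=1}^{q^-} x_i^- (t_i^-)^r &= 0, \quad \text{for } r = 0, 1, \ldots, n, \\
\sum_{i=1}^{q^+} x_i^+ + \sum_{i=1}^{q^-} x_i^- &= 1.
\end{aligned}
\tag{V.9}
\]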

The formulation (V.9) may be written in the simpler form

\[
\begin{array}{ll}
\text{maximize:} & \sum_{i=1}^{q} e^{t_i} x_i \\
\text{subject to:} & \sum_{i=1}^{q} x_i t_i^r = 0, \quad \text{for } r = 0, 1, \ldots, n \\
 & \sum_{i=1}^{q} |x_i| \le 1,
\end{array}
\tag{V.10}
\]

where $t_i \in [0, 3]$ for all $i$. The problems are equivalent in the respect that one may derive from a feasible solution of one a feasible solution to the other. The proof for this statement may be found in [Ref. 9].
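
One direction of this correspondence can be sketched directly; it is offered only as an illustration, and the full argument remains the one in [Ref. 9]. Assume the points of the two finite subsets in (V.9) are merged into a single list $t_1, \ldots, t_q$, and assign to each point the signed weight $x_i = x_j^+$ if $t_i$ was taken from $T^+$ and $x_i = -x_j^-$ if it was taken from $T^-$ (the merged indexing is an assumption of this sketch). Then

\[
\begin{aligned}
\sum_{i=1}^{q} x_i t_i^r &= \sum_{j} x_j^+ (t_j^+)^r - \sum_{j} x_j^- (t_j^-)^r = 0, \quad r = 0, 1, \ldots, n, \\
\sum_{i=1}^{q} |x_i| &= \sum_{j} x_j^+ + \sum_{j} x_j^- = 1, \\
\sum_{i=1}^{q} e^{t_i} x_i &= \sum_{j} x_j^+ e^{t_j^+} - \sum_{j} x_j^- e^{t_j^-},
\end{aligned}
\]

so a feasible solution of (V.9) yields a feasible solution of (V.10) with the same objective value.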

3.  Qualitative  Analysis  of  Solutions 

We  begin  by  restating  the  duality  lemma  in  the  terms  of  the  uniform 
approximation  problem. 


Theorem 9. [Ref. 9] Let the finite subset $\mathcal{T} \subset T$, and the real numbers $x_1, x_2, \ldots, x_q$ be feasible for the dual problem of equation (V.10). Then the following holds for any $y \in \mathcal{R}^{n+1}$:

\[
\sum_{i=1}^{q} x_i e^{t_i} \le \sup_{t \in T} \left| \sum_{r=0}^{n} y_r t^r - e^t \right| .
\tag{V.11}
\]

As this is a direct consequence of the duality lemma, it is not proven here, though the proof may be found in [Ref. 9].

Let us consider the problem of approximating the exponential over $T$ with a quadratic polynomial. Then from (V.8), the objective function vector $c$ is equal to $[0, 0, 0, 1]^T$. With each $t^+$ and $t^- \in T = [0, 3]$, we associate the vectors and scalars

\[
a(t^+) = [(t^+)^0, (t^+)^1, (t^+)^2, 1]^T \quad \text{and} \quad b(t^+) = e^{t^+},
\]

and

\[
a(t^-) = [-(t^-)^0, -(t^-)^1, -(t^-)^2, 1]^T \quad \text{and} \quad b(t^-) = -e^{t^-},
\]

respectively. The dual problem, from equation (V.10), is to find the set $\{t_1, t_2, \ldots, t_q\} = \mathcal{T} \subset [0, 3]$ and associated scalars that maximize

\[
\sum_{i=1}^{q} x_i e^{t_i}
\]

while satisfying

\[
\begin{aligned}
\sum_{i=1}^{q} x_i t_i^r &= 0, \quad \text{for } r = 0, 1, 2, \text{ and} \\
\sum_{i=1}^{q} |x_i| &\le 1.
\end{aligned}
\tag{V.12}
\]

Let us arbitrarily choose the subset $\mathcal{T}$ to be $\{0, 1, 2, 3\}$. Hoping to apply the restated duality result, Theorem 9, we first require a solution to the equation:

\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 4 & 9 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\tag{V.13}
\]

Every such vector is of the form $x = [-a, 3a, -3a, a]^T$, where $a$ is an arbitrary real number. Scaling in order to satisfy the constraint $\sum_{i=1}^{4} |x_i| \le 1$, we let $x = [-\tfrac{1}{8}, \tfrac{3}{8}, -\tfrac{3}{8}, \tfrac{1}{8}]^T$. The hypothesis of Theorem 9 being satisfied, we conclude that the best quadratic approximation to the exponential function over $T = [0, 3]$, in the uniform norm sense, differs from $e^t$ by at least:

\[
\sum_{i=1}^{4} x_i e^{t_i} = -\tfrac{1}{8} e^0 + \tfrac{3}{8} e^1 - \tfrac{3}{8} e^2 + \tfrac{1}{8} e^3 \approx .6342.
\]
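
The lower bound just obtained can be checked numerically. The following is a minimal sketch, assuming only the discretization $\{0, 1, 2, 3\}$ and the scaled weights chosen above; the function name `dual_lower_bound` is hypothetical and introduced here for illustration only. It verifies the constraints of (V.12) exactly with rational arithmetic and then evaluates the dual objective.

```python
import math
from fractions import Fraction


def dual_lower_bound(points, weights, degree):
    """Check feasibility for (V.12) and return sum_i x_i * exp(t_i)."""
    # Moment constraints: sum_i x_i * t_i**r == 0 for r = 0, ..., degree.
    for r in range(degree + 1):
        moment = sum(x * Fraction(t) ** r for t, x in zip(points, weights))
        if moment != 0:
            raise ValueError(f"moment constraint violated for r = {r}")
    # Normalization constraint: sum_i |x_i| <= 1.
    if sum(abs(x) for x in weights) > 1:
        raise ValueError("normalization constraint violated")
    return sum(float(x) * math.exp(t) for t, x in zip(points, weights))


points = [0, 1, 2, 3]
weights = [Fraction(-1, 8), Fraction(3, 8), Fraction(-3, 8), Fraction(1, 8)]
bound = dual_lower_bound(points, weights, degree=2)
print(round(bound, 4))
```

By Theorem 9, no quadratic can approximate the exponential on $[0, 3]$ with a uniform error smaller than the printed value.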


E.  STRONG  DUALITY 

Consider the three different possibilities we may encounter in the solution of the Linear Optimization Problem. Referring to the optimal objective function value of the minimization problem as $V(P)$, and to the optimal value of the dual as $V(D)$, we list the possible conditions, or states, of the problem as follows [Ref. 9]:

Inconsistent: (IC) The feasible set is empty, so that no solution is possible.

Bounded: (B) There exists at least one feasible vector, and among such feasible vectors, at least one is optimal.

Unbounded: (UB) There are feasible vectors such that the objective function may be made arbitrarily small.

A duality gap is said to occur when $V(P) \ne V(D)$, that is, when the optimal values of the dual pair are not the same. We hope to find general conditions that preclude the existence of a duality gap. Theorems that allow us to disregard the possibility of a duality gap are called strong duality theorems.

1.  The  Dual  and  Convexity 

We briefly characterize the dual problem as it relates to our discussion of set convexity. Before continuing, we require the definition of the convex cone. Let $C$ be a convex subset of $\mathcal{R}^n$. The convex cone of $C$, denoted $\chi(C)$, is defined to be the set of all vectors $y \in \mathcal{R}^n$ such that $y = \lambda x$, where $\lambda \ge 0$ and $x \in C$.

In  Chapter  IV  we  constructed  an  example  of  a  polyhedral  set  using  the 


vectors 


a(Sl)  = 


0 

-1 

1 

»  a(s2)  = 

,  and  (s3)  = 

1 

1 

0 

The resultant polyhedral set is illustrated in Figure 8. The darkened region of Figure 11 illustrates the addition to the set that, together with the original polyhedral set, forms the convex cone. The darkened portions of the figure extend to infinity.


Consider the specific case of the convex cone of the constraints of the linear optimization problem. We have expressed the constraints by $\langle a(s), y \rangle \ge b(s)$, for all $s \in S$. Define $A_s = \{a(s) : s \in S\} \subset \mathcal{R}^n$. We know that $A_s$ is convex from equation (IV.1). We refer to the convex cone of $A_s$ as the moment cone of the optimization problem, P, and denote $\chi(A_s)$ by $M_n$.

Figure 11. Formation of the Convex Hull.

Having defined the moment cone, we arrive at a fundamental characterization of the dual problem, D.

Theorem 10. [Ref. 9] The dual problem, D, is feasible (i.e. the feasible set is not empty) if and only if $c \in M_n$.

The proof may be found in [Ref. 9]. The result follows directly from the definition of the dual. An alternate interpretation of this result is as follows. The dual problem is feasible if and only if we may express the vector $c$ as a non-negative combination of the constraint vectors of the linear optimization problem, P.

The following is a generalization of the theorem that allows us to express every element of the convex set, $A$, as a convex combination of the extreme points. The theorem proves vital in the discussion of the Simplex algorithm, as it allows us to bound the required number of elements, $s_i \in S$, in the discretization of our index set when forming the dual.

Theorem 11. The Reduction Theorem [Ref. 9] Let the vector $z \in \mathcal{R}^n$ be a non-negative linear combination of the vectors $z_1, z_2, \ldots, z_q$. That is,

\[
z = \sum_{i=1}^{q} x_i z_i,
\]

with $x_i \ge 0$ for all $i$. Then we may also write:

\[
z = \sum_{i=1}^{q} x_i' z_i, \quad \text{with } x_i' \ge 0,
\]

where at most $n$ of the numbers $x_i'$ are nonzero. Moreover, the set of vectors $\{z_i\}$ corresponding to the nonzero scalars $x_i'$ are linearly independent.

Proof: We first note that if $z_1, z_2, \ldots, z_q$ are already linearly independent, then $q \le n$, and the initial representation of $z$ already satisfies the theorem. Assume, then, that the vectors $z_1, z_2, \ldots, z_q$ are not linearly independent, as is necessarily the case when $q > n$. Then we know that we may write

\[
\sum_{i=1}^{q} \alpha_i z_i = 0,
\]

where at least one $\alpha_i \ne 0$. For any $r$ with $\alpha_r \ne 0$, we have:

\[
z_r = -\frac{1}{\alpha_r} \sum_{i \ne r} \alpha_i z_i.
\tag{V.14}
\]

Substituting into the equation of our hypothesis, we have:

\[
z = \sum_{i \ne r} \left( x_i - x_r \frac{\alpha_i}{\alpha_r} \right) z_i.
\]

We have, then, a representation of $z$ by a linear combination of $q - 1$ of the vectors $z_i$. We must show, then, that the expression $\left( x_i - x_r \frac{\alpha_i}{\alpha_r} \right)$ may be made non-negative for $i = 1, 2, \ldots, r-1, r+1, \ldots, q$. Select some $\alpha_r > 0$. We can clearly do so, since if no $\alpha_i$ is positive, we may multiply by $-1$ and still have the desired result that $\sum_{i=1}^{q} (-\alpha_i) z_i = 0$. Then, if $\alpha_i \le 0$, we may conclude that

\[
x_i - x_r \frac{\alpha_i}{\alpha_r} \ge 0,
\]

since $x_i$, $x_r$ and $\alpha_r$ are each non-negative.

We now consider the case that $\alpha_i > 0$. Then we must show that $\frac{x_i}{\alpha_i} \ge \frac{x_r}{\alpha_r}$. We may accomplish this quite simply, by selecting the $r$ that minimizes the expression $\frac{x_r}{\alpha_r}$ over all $\alpha_r > 0$. We have expressed $z$ as a non-negative linear combination of $q - 1$ of the vectors $z_1, z_2, \ldots, z_q$, and may continue inductively until we have the desired result. □
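
The proof is constructive, and the construction can be sketched in a few lines. The following is an illustrative sketch only, not an implementation taken from [Ref. 9]: the names `null_vector` and `reduce_combination` are hypothetical, and exact rational arithmetic is assumed so that the eliminated coefficient becomes exactly zero. At each pass it finds a dependency $\sum \alpha_i z_i = 0$ among the vectors that still carry positive weight, selects the $r$ that minimizes $x_r / \alpha_r$ over $\alpha_r > 0$, and applies the update $x_i \leftarrow x_i - x_r \alpha_i / \alpha_r$.

```python
from fractions import Fraction


def null_vector(vectors):
    """Return alpha != 0 with sum_i alpha_i * vectors[i] == 0, or None if independent."""
    q, n = len(vectors), len(vectors[0])
    # Row i holds the i-th coordinate of every vector; columns index the vectors.
    rows = [[Fraction(vectors[j][i]) for j in range(q)] for i in range(n)]
    pivots, r = [], 0
    for col in range(q):
        if r == n:
            break
        pr = next((i for i in range(r, n) if rows[i][col] != 0), None)
        if pr is None:
            continue
        rows[r], rows[pr] = rows[pr], rows[r]
        piv = rows[r][col]
        rows[r] = [v / piv for v in rows[r]]
        for i in range(n):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    free = [c for c in range(q) if c not in pivots]
    if not free:
        return None
    # Set one free coefficient to 1 and read the pivot coefficients off the reduced rows.
    alpha = [Fraction(0)] * q
    alpha[free[0]] = Fraction(1)
    for i, pc in enumerate(pivots):
        alpha[pc] = -rows[i][free[0]]
    return alpha


def reduce_combination(vectors, x):
    """Rewrite sum_i x_i * vectors[i] (x_i >= 0) using linearly independent vectors only."""
    x = [Fraction(v) for v in x]
    while True:
        support = [i for i, v in enumerate(x) if v > 0]
        alpha = null_vector([vectors[i] for i in support]) if support else None
        if alpha is None:
            return x
        if all(a <= 0 for a in alpha):
            alpha = [-a for a in alpha]  # guarantee that some alpha_r > 0
        # Choosing r to minimize x_r / alpha_r keeps every updated coefficient >= 0.
        ratio = min(x[i] / a for i, a in zip(support, alpha) if a > 0)
        for i, a in zip(support, alpha):
            x[i] -= ratio * a


vectors = [(1, 0), (0, 1), (1, 1)]
reduced = reduce_combination(vectors, [1, 1, 1])
print(reduced)
```

In this small example $z = (2, 2)$ is first written with three vectors of $\mathcal{R}^2$ and afterwards with at most two, as the theorem requires.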

The reduction theorem yields this immediate result. Let the finite subset $\{s_1, \ldots, s_q\} \subset S$, and the set of non-negative numbers $\{x_1, \ldots, x_q\}$ be feasible for the dual problem, D. That is:

\[
\sum_{i=1}^{q} x_i\, a_r(s_i) = c_r,
\]

for $r = 1, 2, \ldots, n$. Then there is a subset $\{s_{i_1}, \ldots, s_{i_n}\}$ and a set of non-negative numbers $\{x'_{i_1}, \ldots, x'_{i_n}\}$ that is also feasible for D. Note that we have not included the objective function of the dual in our reduction above. It is not necessarily true, then, that we need only consider discretizations of $S$ with cardinality $n$. That is, let us reduce the non-negative linear combination

\[
\sum_{i=1}^{q} x_i\, a(s_i), \quad \text{where } q > n,
\]

to the combination

\[
\sum_{i=1}^{q} x_i'\, a(s_i),
\]

where no more than $n$ of the scalars $x_i'$ are non-zero. Then it may be that

\[
\sum_{i=1}^{q} x_i\, b(s_i) \ne \sum_{i=1}^{q} x_i'\, b(s_i).
\]

Consequently, we include the optimal objective function value in the set of equations for reduction. This convention requires that we define a new moment cone, which we call $M_{n+1}$:

\[
M_{n+1} = \chi(A'),
\]

where $A'$ is formed by the vectors

\[
a'(s) = [b(s), a_1(s), \ldots, a_n(s)]^T \in \mathcal{R}^{n+1}, \quad s \in S.
\]

The dual, then, may be stated

\[
\begin{array}{ll}
\text{maximize:} & c_0 \\
\text{subject to:} & c \in M_{n+1}, \quad \text{where } c = [c_0, c_1, \ldots, c_n]^T.
\end{array}
\]

This formulation is useful in the discussion of strong duality results.

2. Solvability Conditions


We move from the infinite case to the case of a finite index set. The following results are presented without formal proof, though the proofs may be found in [Ref. 9] or [Ref. 10]. These theorems enable us to determine when the dual problem has a solution. That is, we seek to determine when there exists at least one vector of our feasible set that optimizes our objective function. Note the distinction between solvability and boundedness as defined in the list of states above. That is, we may have feasible vectors, but no optimal vector in our feasible set. The discussion in this section pertains to the finite case of the linear optimization problem. Readers interested in an examination of some criteria for the convergence of the LOP in the case of an infinite index set are referred to [Ref. 11].

Theorem 12. [Ref. 9] Let the linear optimization problem, P, be such that $M_{n+1}$ is closed, and the dual problem, D, is bounded. Then D has a solution.

The proof of this theorem is straightforward. Recognize that the objective function of the dual is $f : \mathcal{R}^{n+1} \to \mathcal{R}$ given by $f(z_0, z_1, \ldots, z_n) = z_0$. Then $f$ is clearly continuous on a compact set, and we conclude the result.

Theorem 13. [Ref. 9] Any convex cone P defined by a finite number of vectors in $\mathcal{R}^n$ is closed, in that any convergent sequence of vectors in P converges to a vector in P.

Coupling these observations, we conclude that any finite dual pair, (P, D), with both P and D consistent, is solvable. That is, both the primal and dual have solutions.

Figure 12. The Separating Hyperplane $H(y, \nu)$ of the Set, $M$, at the Point $z$.

3.  Separating  Hyperplanes 

We now address the final tool that we use to eliminate a duality gap in the linear program. Let $H(y, \nu) = \{x \in \mathcal{R}^p : y^T x = \nu\}$. Then the hyperplane $H(y, \nu)$ is said to separate $z$, a vector not in $M$, from the convex set $M$ if

\[
y^T x \le \nu < y^T z,
\]

for all $x \in M$. Figure 12 illustrates one separating hyperplane between the point $z$ and the set $M$, which is contained in $\mathcal{R}^2$. Let $z_0$ be the vector in $M$ closest to $z$ in the Euclidean norm sense. Let $y = z - z_0$, and let $\nu = y^T z_0$. Then $H(y, \nu)$ is the line orthogonal to $y$ at the point $z_0$.

Theorem 14. The Separation Theorem [Ref. 9] Define $\|x\|$ to be the standard Euclidean 2-norm. Let $M \subset \mathcal{R}^p$ be a non-empty, closed convex set, and let $z$ not be in $M$. Further, let $z_0$ be the unique vector in $M$ such that $\|z - z_0\| \le \|z - x\|$ for all $x \in M$.¹ Finally, let $y = z - z_0$, and $\nu = (z - z_0)^T z_0$. Then the hyperplane $H(y, \nu)$ separates $z$ from $M$.

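Proof: Let $x \in M$, and fix $0 < \rho < 1$. Then

\[
(1 - \rho) z_0 + \rho x = z_0 + \rho (x - z_0) \in M,
\]

as $M$ is a convex set. Further,

\[
\begin{aligned}
\|z - z_0\|^2 &\le \|z - (z_0 + \rho (x - z_0))\|^2 \\
&= \|z - z_0\|^2 - 2\rho (z - z_0)^T (x - z_0) + \rho^2 \|x - z_0\|^2,
\end{aligned}
\]

which implies that

\[
(z - z_0)^T (x - z_0) \le \frac{\rho}{2} \|x - z_0\|^2.
\]

Let $\rho \to 0$. Then $(z - z_0)^T (x - z_0) \le 0$, that is,

\[
(z - z_0)^T x \le \nu, \quad \text{for any } x \in M.
\]

Then by the definition of a separating hyperplane, we have only to show that $\nu < y^T z$. Since $z$ is not in $M$,

\[
\begin{aligned}
0 < \|z - z_0\|^2 &= (z - z_0)^T (z - z_0) \\
&= y^T z - y^T z_0 = y^T z - \nu.
\end{aligned}
\]

□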
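
The construction in Theorem 14 is easy to try on a concrete set. The sketch below is illustrative only and rests on stated assumptions: the set $M$ is taken to be the unit square in the plane, for which the closest point is obtained by clipping each coordinate, and the helper names `separating_hyperplane`, `project_unit_square` and `dot` are hypothetical. It builds $y = z - z_0$ and $\nu = (z - z_0)^T z_0$ and then samples points of $M$ to confirm $y^T x \le \nu < y^T z$.

```python
import random


def separating_hyperplane(z, project):
    """Return (y, nu) as in Theorem 14, given a projection onto a closed convex set M."""
    z0 = project(z)                               # closest point of M to z
    y = [zi - z0i for zi, z0i in zip(z, z0)]      # y = z - z0
    nu = sum(yi * z0i for yi, z0i in zip(y, z0))  # nu = (z - z0)^T z0
    return y, nu


def project_unit_square(p):
    """Euclidean projection onto M = [0, 1] x [0, 1] by componentwise clipping."""
    return [min(1.0, max(0.0, pi)) for pi in p]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


z = [2.0, 3.0]  # a point outside M
y, nu = separating_hyperplane(z, project_unit_square)

random.seed(0)
samples = [[random.random(), random.random()] for _ in range(1000)]
holds_on_m = all(dot(y, x) <= nu + 1e-12 for x in samples)
print(y, nu, holds_on_m, nu < dot(y, z))
```

Here $z_0 = (1, 1)$, so the hyperplane is the line $x_1 + 2 x_2 = 3$, which touches the square at its corner and leaves $z$ strictly on the other side.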


The separating hyperplane defined above is a necessary tool in the elimination of duality gaps in the finite linear optimization problem.


¹ That such a unique vector exists is proven in [Ref. 9].




4.  The  Strong  Duality  Theorem 


We close this section with a statement and proof of a fundamental theorem of linear optimization, which states sufficient conditions for the absence of a duality gap in the dual pair, (P, D).

Theorem 15. [Ref. 8] Let the dual pair, (P, D), satisfy the following assumptions.

1. The dual problem is consistent and has a finite value $V(D)$.

2. The moment cone, $M_{n+1}$, is closed.

Then (P) is consistent, and $V(P) = V(D)$. That is, no duality gap occurs.

Proof: Let the vector $c = [c_0, c_1, \ldots, c_n]^T \in M_{n+1}$ be an optimal solution of the dual problem. Then, for any $\epsilon > 0$, the vector $c' = [c_0 + \epsilon, c_1, \ldots, c_n]^T$ is not in $M_{n+1}$. As we are assuming that $M_{n+1}$ is closed, we conclude that there is a hyperplane separating the vector $c'$ from $M_{n+1}$. Consequently, there exists some vector $y = [y_0, y_1, \ldots, y_n]^T \in \mathcal{R}^{n+1}$, with $y \ne 0$, such that

\[
\sum_{r=0}^{n} x_r y_r \le 0 < y_0 (c_0 + \epsilon) + \sum_{r=1}^{n} c_r y_r,
\]

for $x = [x_0, x_1, \ldots, x_n]^T \in M_{n+1}$. Let $x = c$. Then $y_0 \epsilon > 0$, implying $y_0 > 0$. Now let

\[
x = [b(s), a_1(s), \ldots, a_n(s)]^T \in A' \subset M_{n+1},
\]

where $s \in S$, and $a_r(s)$ is the $r$th component of the constraint vector associated with $s$. Then

\[
\sum_{r=1}^{n} a_r(s) \left( -\frac{y_r}{y_0} \right) \ge b(s),
\]

which implies that

\[
\hat{y} = \left[ -\frac{y_1}{y_0}, \ldots, -\frac{y_n}{y_0} \right]^T \in \mathcal{R}^n
\]

is feasible for the primal, P. Further, dividing

\[
0 < y_0 (c_0 + \epsilon) + \sum_{r=1}^{n} c_r y_r
\]

by $y_0 > 0$ gives $\sum_{r=1}^{n} c_r \hat{y}_r < c_0 + \epsilon$. Applying the duality lemma, we conclude that

\[
V(P) \le \sum_{r=1}^{n} c_r \hat{y}_r < c_0 + \epsilon = V(D) + \epsilon \le V(P) + \epsilon,
\]

implying

\[
V(P) - \epsilon < V(D) \le V(P),
\]

for any $\epsilon$. □

As a final note, if the index set of our constraints is finite, then we may conclude immediately that no duality gap exists in the dual pair, (P, D). This follows directly from the above theorem in conjunction with Theorems 12 and 13.


F.  THE  SIMPLEX  ALGORITHM 

We present a very brief introduction to the Simplex algorithm, and use it to solve a simple LOP. This section is not intended to illustrate the implementation of the algorithm in any specific form. Rather, this section attempts to explain the algorithm as it exploits the results of the duality concepts above. The problem is assumed to be infinite-dimensional in this presentation.

We begin with a problem, P, of the form:

\[
\begin{array}{ll}
\text{Minimize:} & \sum_{r=1}^{n} c_r y_r \\
\text{subject to:} & \sum_{r=1}^{n} a_r(s)\, y_r \ge b(s), \quad \text{for all } s \in S.
\end{array}
\]

Then we write the dual, D:

\[
\begin{array}{ll}
\text{Maximize:} & \sum_{i=1}^{n} b(s_i)\, x_i \\
\text{subject to:} & \sum_{i=1}^{n} a_r(s_i)\, x_i = c_r, \quad r = 1, 2, \ldots, n \\
 & s_i \in S, \quad x_i \ge 0.
\end{array}
\]

Choose some subset, $\sigma = \{s_1, s_2, \ldots, s_n\} \subset S$, and a vector $x = [x_1, x_2, \ldots, x_n]^T$ that is feasible for the dual. The methods to arrive at an initial feasible vector, provided one exists, may be found in any linear programming text. In particular, the reader is referred to [Ref. 7]. We derive a vector $y$ from our choice of $\sigma$, which is associated with the primal problem. As a matter of convenience, we abbreviate this set of values $\{\sigma, x, y\}$. We require that the vectors $a(s_i)$ be linearly independent. That we may always find a set of linearly independent vectors is assumed in this presentation.

Forming our matrix $A$ as before, we know that the linear independence of the vectors ensures that there is a unique vector $x$ satisfying:

\[
A x = c,
\]

since we are feasible in the dual. Define the discretized primal to be the linear program that results from considering only the finite subset of the index set $S$. Let

\[
A(s_1, s_2, \ldots, s_n) = [a(s_1), a(s_2), \ldots, a(s_n)],
\]

with $b(s_1, \ldots, s_n)$ defined in the same manner. From the discretization, $\sigma$, we look for a vector, $y$, that is feasible for the discretized primal, P. We note that one such vector $y$ solves the equation:

\[
A^T(s_1, s_2, \ldots, s_n)\, y = b(s_1, s_2, \ldots, s_n).
\]

Then

\[
y = (A^T)^{-1} b.
\]

The set of values of $\sigma$ and the vector $y$ that is formed in the manner above is called a basic solution of the LOP. The steps of the algorithm to this point are:

1. Select a subset, $\sigma \subset S$, such that the vectors $a(s_1), a(s_2), \ldots, a(s_n)$ are linearly independent.

2. Compute the unique non-negative solution to the equation $A x = c$.

3. Compute the solution to the system $A^T y = b$ for the discretized primal.

Return to the problem of approximating the exponential with a quadratic polynomial over the interval $[0, 3]$. We have formulated the problem with the constraint vectors of the index sets, $a(t^+)$ and $a(t^-)$, given by

\[
a(t^+) = \begin{bmatrix} 1 \\ t \\ t^2 \\ 1 \end{bmatrix}, \quad \text{and} \quad a(t^-) = \begin{bmatrix} -1 \\ -t \\ -t^2 \\ 1 \end{bmatrix}.
\]

Additionally, the constraint scalars were defined to be $b(t^+) = e^t$ and $b(t^-) = -e^t$, and the objective function vector was given by $c = [0, 0, 0, 1]^T$. The problem is

\[
\begin{array}{ll}
\text{minimize:} & c^T y \\
\text{subject to:} & a(t^+)^T y \ge b(t^+) \\
 & a(t^-)^T y \ge b(t^-) \quad \text{for all } t \in T \\
 & \text{over all } y \in \mathcal{R}^4.
\end{array}
\]


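Step One: Arbitrarily select $\sigma$ to be composed of the union of the sets $\sigma_1 = \{0, 2\} \subset T^-$ and $\sigma_2 = \{1, 3\} \subset T^+$.

Step Two: Compute the solution of the system

\[
\begin{bmatrix} -1 & 1 & -1 & 1 \\ 0 & 1 & -2 & 3 \\ 0 & 1 & -4 & 9 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\tag{V.15}
\]

The solution of this system is given by

\[
x = \left[ \tfrac{1}{8}, \tfrac{3}{8}, \tfrac{3}{8}, \tfrac{1}{8} \right]^T.
\]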


Step Three: Compute the solution of

\[
\begin{bmatrix} -1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ -1 & -2 & -4 & 1 \\ 1 & 3 & 9 & 1 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ k \end{bmatrix}
=
\begin{bmatrix} -e^0 \\ e^1 \\ -e^2 \\ e^3 \end{bmatrix}
\tag{V.16}
\]

The vector $y = [1.6342, -2.2946, 2.7445, .6342]^T$ is the unique solution of this system. That is, $y$ is feasible for the discretized primal. The first approximation is given by

\[
P_2(x) = 1.6342 - 2.2946x + 2.7445x^2.
\tag{V.17}
\]

The graph of the exponential versus the approximation is given in Figure 13.

Figure 13. A First Approximation of the Exponential Function.
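
Steps one through three of this example can be reproduced with a short script. This is a sketch under stated assumptions rather than an implementation of the Simplex algorithm: it uses only the discretization chosen in step one, a small Gauss-Jordan routine written here for self-containment, and hypothetical helper names (`solve`, `a_plus`, `a_minus`).

```python
import math


def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(v)] for row, v in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(n):
            if i != col:
                f = aug[i][col] / aug[col][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def a_plus(t):
    """Constraint vector a(t+) for the quadratic case."""
    return [1, t, t * t, 1]


def a_minus(t):
    """Constraint vector a(t-) for the quadratic case."""
    return [-1, -t, -t * t, 1]


# Step one: sigma is {0, 2} taken from T- together with {1, 3} taken from T+.
columns = [a_minus(0), a_plus(1), a_minus(2), a_plus(3)]
b = [-math.exp(0), math.exp(1), -math.exp(2), math.exp(3)]
c = [0, 0, 0, 1]

# Step two: A x = c, where the columns of A are the selected constraint vectors.
A = [[col[r] for col in columns] for r in range(4)]
x = solve(A, c)

# Step three: A^T y = b; the rows of A^T are the selected constraint vectors.
y = solve(columns, b)

dual_value = sum(bi * xi for bi, xi in zip(b, x))
print([round(v, 4) for v in x])
print([round(v, 4) for v in y])
print(round(dual_value, 4))
```

The last component of $y$ coincides with the dual objective value $\sum_i b(s_i) x_i$ for this basic solution, which is the lower bound found earlier for the subset $\{0, 1, 2, 3\}$.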

We here introduce a theorem that offers us a termination criterion for the algorithm.

Theorem 16. The Complementary Slackness Theorem [Ref. 9] Let the set $\{\sigma, x, y\}$ be as above. If the vector $y$ is feasible for the non-discretized primal P, and the following holds:

\[
\sum_{r=1}^{n} a_r(s_i)\, y_r = b(s_i), \quad \text{for } i = 1, 2, \ldots, n,
\]

then $y$ is optimal for P. We may therefore conclude that if the vector $y$, as determined in step 3, is feasible for the primal, P, we have found the optimal vector in our problem.

In the current approximation problem, we find that the current solution does not satisfy this criterion. We observe the graph of the absolute difference between the functions and find that the error exceeds .6342 over the latter portion of the interval. See Figure 14.

Figure 14. Absolute Error in the First Exponential Approximation.
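
The feasibility test for the non-discretized primal can be approximated by sampling the interval. The sketch below is a rough check only: it assumes the rounded coefficients of (V.17), a uniform grid of 3001 points on $[0, 3]$ (an arbitrary choice), and a hypothetical helper `abs_error`. A sampled error larger than $k = .6342$ shows that some constraint of P is violated, so the termination criterion is not met.

```python
import math

# Coefficients of the first approximation (V.17) and the associated value k.
coeffs = [1.6342, -2.2946, 2.7445]
k = 0.6342


def abs_error(t):
    """|P2(t) - e^t| for the quadratic with the coefficients above."""
    p = coeffs[0] + coeffs[1] * t + coeffs[2] * t * t
    return abs(p - math.exp(t))


grid = [3 * i / 3000 for i in range(3001)]  # uniform sample of T = [0, 3]
worst_t = max(grid, key=abs_error)
worst = abs_error(worst_t)
print(round(worst_t, 3), round(worst, 4), worst > k)
```

A grid check of this kind can only demonstrate infeasibility; it cannot prove feasibility on the whole interval, since a violation may lie between sample points.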

The remainder of the algorithm is a sequence of exchange steps that replace existing elements of the set $\sigma$ with elements that improve the value of the dual problem, D, and consequently improve the bound of $V(P)$. The method of selecting new elements for the set $\sigma$ may change with implementation, but it should be noted that exactly one element of the set $\sigma$ is replaced at a given step, in any implementation. Recalling our discussion of extreme points of the feasible set, that strategy ensures that the algorithm looks to adjacent extreme points for optimality.

Figure 15. Absolute Error in the Second Exponential Approximation.

We conduct one such exchange. Note that the error is most severe at the point $t = 3$. Then it is logical to seek a better solution at that point. Then we let $\sigma_1 = \{0, 3\}$. The new system of equations requiring a solution in step 3 is

\[
\begin{bmatrix} -1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ -1 & -3 & -9 & 1 \\ 1 & 3 & 9 & 1 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ k \end{bmatrix}
=
\begin{bmatrix} -e^0 \\ e^1 \\ -e^3 \\ e^3 \end{bmatrix}
\tag{V.18}
\]

The solution of the above system is given by


y  = 


l  i 
2’2 


We  find  that  the  error  is  decreased.  The  absolute  error  is  given  in  Figure  15. 

We note that the solution is not feasible for the entire interval, since there exist points where the error exceeds .5. Thus, we would look to adjacent extreme point solutions and repeat the process until we arrive at a discretized solution that is feasible throughout the interval.

VI. RECONSTRUCTION FORMULATION AND SOLUTION

A.  OVERVIEW 

Having laid the complete foundation, we formulate the image reconstruction optimization problem. The first portion of this chapter addresses the conceptual aspects of the problem, while in the latter portion we use the Simplex algorithm to solve a simple reconstruction problem. We conclude the chapter with a brief discussion of the merits and drawbacks of a Linear Programming approach to the reconstruction problem.

B.  TARGET  FUNCTIONS  AND  NORMS 

The problem we wish to solve is to find the density function, $f$, that produces the observed sampled Radon transform. As the problem is ill-posed, we must define some preference function by which to compare the quality of the infinitely many density functions that satisfy the above requirement. We do so by specifying some function, $g$, defined over $\Omega$, which is assumed to represent the most likely density of the image. That is, of all density functions that produce the observed transform data, we seek that which is most like what we expect to find. How we determine the function $g$ is not a matter of discussion here. We only assume that we know some such function.

The problem of how to determine the best solution becomes one of finding the density function that produces the observed transform and is "closest" to $g$ in some sense. We choose the infinity norm, or max norm, to measure closeness. Let $P$ be a polygonal partition of the compact set $\Omega \subset \mathcal{R}^2$, consisting of the $n$ polygons $s_1, s_2, \ldots, s_n$. Recall that the function $\psi_j(\omega)$ is defined to be the characteristic function of polygon $j$ in $P$. Imposing the restriction that the optimum density be constant over each polygon, the density takes on the form

\[
f(\omega) = \sum_{j=1}^{n} \alpha_j \psi_j(\omega).
\]

We seek a density, $f$, defined over $\Omega$ that minimizes the following:

\[
\| f(\omega) - g(\omega) \|_\infty = \max_{j = 1, \ldots, n} \left\{ \max_{\omega \in s_j} | \alpha_j - g(\omega) | \right\}.
\tag{VI.1}
\]

We also choose some $\epsilon > 0$ and insist that

\[
| f_P - b | \le \epsilon, \quad \text{and} \quad f \ge 0,
\]

where the vector inequality is componentwise. Recall that $f_P$ is defined to be the sampled transform of the density $f$ for partition $P$. The vector $b$ is the observed sample Radon transform. The non-negativity constraint stems from the physical nature of the problem. That is, we do not accept solutions that attribute negative density to physical objects.

Before continuing, let us consider the objective function of equation (VI.1). Recall that our attention is fixed on density functions defined on $\Omega$, a compact subset of $\mathcal{R}^2$. Let us first fix our attention on some polygon, $s_j$, in the polygonal partition $P$. Let $M_j$ denote the largest absolute difference between our target function, $g$, and the scalar, $\alpha_j$, that we associate with the polygon $s_j$. That is, $f(\omega) = \alpha_j$ for all $\omega \in s_j$. The term

\[
\max_{\omega \in s_j} | \alpha_j - g(\omega) |
\]

is well defined, as both functions are piecewise continuous over the compact set $s_j$. The objective function is defined to be the largest of the $M_j$ values over all polygons. As the problem is not linear, we write an equivalent formulation:

\[
\begin{array}{ll}
\text{minimize:} & k \\
\text{subject to:} & \| f_P - b \|_\infty \le \epsilon \\
 & \alpha_j + k \ge g(\omega) \psi_j(\omega), \quad \text{for all } \omega \in \Omega, \\
 & -\alpha_j + k \ge -g(\omega) \psi_j(\omega), \quad \text{for all } \omega \in \Omega, \\
 & \alpha_j \ge 0, \quad \text{for all } j.
\end{array}
\tag{VI.2}
\]
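
To make the role of the quantities $M_j$ concrete, consider the objective (VI.1) alone, with the transform constraint set aside. For a single polygon, the constant that minimizes $\max | \alpha_j - g(\omega) |$ is the midpoint between the smallest and largest values of $g$ on that polygon, and the resulting $M_j$ is half their difference. The sketch below illustrates this under loud assumptions: $g$ is represented by a finite list of hypothetical sample values per polygon, and the names `best_constant` and `objective` are introduced here for illustration only.

```python
def best_constant(samples):
    """Constant alpha minimizing max |alpha - g| over the samples, and that minimum M."""
    lo, hi = min(samples), max(samples)
    return (lo + hi) / 2, (hi - lo) / 2


def objective(cells):
    """Value of (VI.1) when every polygon receives its best constant: the largest M_j."""
    fitted = [best_constant(samples) for samples in cells]
    alphas = [alpha for alpha, _ in fitted]
    return alphas, max(m for _, m in fitted)


# Hypothetical target values g(omega) sampled inside three polygons.
cells = [[0.0, 0.2, 0.1], [1.0, 1.5], [0.4, 0.4, 0.6]]
alphas, k = objective(cells)
print(alphas, k)
```

Because the target values are non-negative, the midpoints are non-negative as well, so the constraint $\alpha_j \ge 0$ is not active in this simplified setting. Restoring the transform constraint couples the polygons, which is why the full problem is posed as a linear program.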

Suppose the target function, $g$, is chosen to be continuous over $\Omega$, and further suppose that $g_P = b$, where $g_P$ is the sample transform of the target density, $g$. If the method is to prove worthwhile, we expect that the test density function is optimal. That is, if the test and target densities are the same, we can expect to find an arbitrarily good approximation of the test density. We state the above formally.

Theorem 17. Let $g$ be a non-negative, continuous function defined on the set $\Omega$. Additionally, let values for $\epsilon > 0$ and $\xi > 0$ be given. Then there exists some partition, $P$, of $n$ polygons, and an associated function, $f = \sum_{j=1}^{n} \alpha_j \psi_j$, so that the optimum value of the linear optimization problem:

\[
\begin{array}{lll}
\text{minimize:} & k & \\
\text{subject to:} & \| f_P - g_P \|_\infty \le \epsilon & \text{(VI.3)} \\
 & \alpha_j + k \ge g(\omega) \psi_j(\omega), \quad \text{for all } \omega \in \Omega, & \text{(VI.4)} \\
 & -\alpha_j + k \ge -g(\omega) \psi_j(\omega), \quad \text{for all } \omega \in \Omega, & \text{(VI.5)} \\
 & \alpha_j \ge 0, \quad \text{for all } j & \text{(VI.6)}
\end{array}
\]

is less than $\xi$.

Proof: We show that we may find a feasible vector for any value of $k$, and consequently, for $k < \epsilon$. The proof depends on the continuity of the sample Radon transform. That is, let $g$ be any continuous function defined on $\Omega$, and let $\xi > 0$ be given. Then we wish to show that there exists some $\delta_1 > 0$ and some partition $P_{\delta_1}$, such that the following property holds:

$$\| f - g \|_\infty < \delta_1 \implies \| f_{P_{\delta_1}} - g_{P_{\delta_1}} \|_\infty < \xi.$$

Let $h(\omega) = | f(\omega) - g(\omega) |$. Recall that for a fixed partition, a single integral over a strip, $q$, defining the sample transform takes on the form

$$h_q = \int_\Omega h(\omega)\,\gamma_q(\omega)\,d\omega,$$

where $\gamma_q(\omega)$ is the characteristic function of the $q$th strip. Let $M$ denote the area of the largest polygon in our partition, and choose our $\delta_1$ to be less than [...]. That is, let the functions $f$ and $g$ differ by no more than $\delta_1$ in the uniform norm sense. We have already proven that we may do so for some partition. Then we know for each element of our sample data vector

$$h_q \le \delta_1 \left( \int_\Omega \gamma_q(\omega)\,d\omega \right) < \xi. \tag{VI.7}$$

As VI.7 holds for each of the finite number of sample integrals, we may conclude that

$$\| f_{P_{\delta_1}} - g_{P_{\delta_1}} \|_\infty < \xi.$$

Thus, if we can disregard constraints VI.4, VI.5, and VI.6, for any $\xi > 0$, we may find a partition $P_{\delta_1}$ that ensures

$$\| f - g \|_\infty < \delta_1$$

so that

$$\| f_{P_{\delta_1}} - g_{P_{\delta_1}} \|_\infty < \xi.$$

This implies that for any value of $\xi$, constraint (VI.3) is met.

Temporarily disregarding the constraint $\| f_P - g_P \|_\infty \le \xi$, we have the less restrictive optimization problem:

$$\begin{aligned}
\text{minimize:} \quad & k \\
\text{subject to:} \quad & \alpha_j + k \ge g(\omega)\,\psi_j(\omega), \quad \text{for all } \omega \in \Omega, \\
& -\alpha_j + k \ge -g(\omega)\,\psi_j(\omega), \quad \text{for all } \omega \in \Omega, \\
& \alpha_j \ge 0, \quad \text{for all } j.
\end{aligned} \tag{VI.8}$$

That the value of the above optimization problem may be made arbitrarily small is a direct result of the fact that we may represent any continuous function, $g$, on $\Omega$ arbitrarily well by a function of the desired form, per Theorem 2. Denote the partition for which the value of $k$ is less than $\epsilon$ by $P_{\delta_2}$. The term $\delta_2$ is the largest difference between two points in the same polygon of $P_{\delta_2}$. That is,

$$x, y \in s_j \subset P_{\delta_2} \implies \| x - y \|_\infty < \delta_2.$$

Finally, choosing $\delta$ to be $\min\{\delta_1, \delta_2\}$, constraint VI.3 is met, as are constraints VI.4 through VI.6 for $k < \epsilon$. Then the problem is feasible. As this is a finite dimensional problem, we employ Theorem 13 to ensure that a solution to the problem exists. Therefore, the optimal value of the optimization problem is less than $\epsilon$. □

From the above claim, if our partition is sufficiently fine, we may reasonably expect to find an acceptable approximation to the solution of our optimization problem.

C.  PROBLEM  STANDARDIZATION 

We wish to understand the above formulation as it relates to our definition of the general linear optimization problem. Before proceeding, it is vital to note that we are formulating the problem after we have generated a polygonal partition, $P$, of $\Omega$. Throughout this section, we assume that $P$ contains the $n$ polygons, $s_1, \ldots, s_n$.

1. Inner Product Constraints: Refining the Feasible Set

Before considering the constraints themselves, recall that the polygonal partition, $P$, forms an $n$-dimensional basis for a subset of the space of functions from which we select our optimal function. We may, consequently, think of any density function $f$ as a vector $y \in \mathbb{R}^n$, where the $j$th component of $y$ is the scalar value of the density on polygon $s_j$. For reasons that become clear shortly, we augment the vector of decision variables to be $y = [\alpha_1, \alpha_2, \ldots, \alpha_n, k]^T \in \mathbb{R}^{n+1}$.

We  divide  our  constraints  into  three  distinct  classes: 

1)  Strip  based  constraints, 

2)  Polygon  based  constraints,  and 

3) $\Omega$ based constraints.

First consider the strip based constraints. We require that the sample transform of the optimal objective density be within a specified tolerance of the observed sample transform. The constraints were identified in the previous section by the inequality

$$\| f_P - b \|_\infty \le \xi.$$

Eliminating the norm above results in the two constraints

$$-f_P \ge -b - \xi, \tag{VI.9}$$

and

$$f_P \ge b - \xi. \tag{VI.10}$$

Let $m$ denote the number of strips used to generate the partition, $P$. Let $Q = \{q_1, q_2, \ldots, q_m\}$ be the set of such strips. Then for each $q_i \in Q$, we require that the $j$th component of our constraint vector be determined by the following rule:

$$a_j(q_i) = \text{area}(s_j)\, \max_{\omega} \{ \gamma_{q_i}(\omega)\, \psi_j(\omega) \}.$$

Of course, this convention is the same as that of Chapter III. The $j$th component of the vector, $a(q_i)$, is the area of the $j$th polygon if the polygon falls within strip $q_i$, and zero otherwise. As before, we wish to consider the two constraints associated with each strip separately, and define $Q^-$ and $Q^+$ to index constraints (VI.9) and (VI.10) respectively.

The right hand side of each constraint is also determined as in Chapter III. That is, $b(q_i^-)$ and $b(q_i^+)$ are as in equations (VI.9) and (VI.10), where $b_i$ is the data from strip $q_i$ of our sample transform. We append a zero to each strip based constraint vector, as each is independent of the value $k$.
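The construction of the strip based rows can be sketched in a few lines. The example below is only an illustration under stated assumptions: the partition is a hypothetical 2 x 2 grid of equal squares rather than the partition used later in the chapter, each strip is given as the set of polygon indices it contains, and the names `strip_constraints` and `satisfies` are introduced here for convenience. All constraints are kept in the form $\langle a, y \rangle \ge b$ with $y = [\alpha_1, \ldots, \alpha_n, k]^T$.

```python
def strip_constraints(areas, strips, data, xi):
    """Rows and right-hand sides for the strip constraints (VI.9) and (VI.10).

    areas  : area of each polygon s_j
    strips : for each strip q_i, the set of polygon indices it contains
    data   : observed strip data b_i
    xi     : tolerance on the sample transform
    """
    rows, rhs = [], []
    for members, b_i in zip(strips, data):
        # a_j(q_i) = area(s_j) if polygon j lies in strip q_i, else 0
        a = [area if j in members else 0.0 for j, area in enumerate(areas)]
        # Q-:  -f_P >= -b - xi   (a zero is appended for the variable k)
        rows.append([-v for v in a] + [0.0])
        rhs.append(-b_i - xi)
        # Q+:   f_P >=  b - xi
        rows.append(a + [0.0])
        rhs.append(b_i - xi)
    return rows, rhs


def satisfies(rows, rhs, y, tol=1e-12):
    """True when <row, y> >= rhs holds for every constraint."""
    return all(sum(r_j * y_j for r_j, y_j in zip(row, y)) >= b - tol
               for row, b in zip(rows, rhs))


# Hypothetical 2 x 2 grid on the unit square: polygons 0..3, row-major.
areas = [0.25, 0.25, 0.25, 0.25]
strips = [{0, 1}, {2, 3}, {0, 2}, {1, 3}]  # two horizontal, two vertical strips
alpha_true = [1.0, 2.0, 3.0, 4.0]
data = [sum(areas[j] * alpha_true[j] for j in s) for s in strips]

rows, rhs = strip_constraints(areas, strips, data, xi=0.1)
print(satisfies(rows, rhs, alpha_true + [0.0]))          # generating density
print(satisfies(rows, rhs, [1.0, 2.0, 3.0, 5.0, 0.0]))   # perturbed density
```

The density that generated the data satisfies every row, while the perturbed density overshoots the second strip by more than the tolerance and is rejected.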

The polygon based constraints are found entirely in the requirement that our density function be non-negative. In the initial formulation, the requirement was written

$$\alpha_j \ge 0, \quad \text{for all } j.$$

That is, we require that the density assigned to each polygon in the optimal vector be non-negative. Let $P = \{s_1, s_2, \ldots, s_n\}$ be the fixed partition. Then the constraint vector associated with each polygon is formed by the simple rule:

$$a_j(s_i) = \max_{\omega} \{ \psi_j(\omega)\, \psi_i(\omega) \} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{otherwise,} \end{cases} \quad \text{for } i, j = 1, 2, \ldots, n.$$

We append a zero to each $a(s_i)$ as in the case of the strip based constraints, as we are selecting a vector from $\mathbb{R}^{n+1}$. Clearly, $b(s_i) = 0$ for all $i$. Then the vector form of each polygon based constraint is:

$$\langle a(s_i), y \rangle \ge 0, \quad \text{for } i = 1, 2, \ldots, n.$$


As our third class of constraints corresponds to the set $\Omega$, we may correctly infer that the final index sets are infinite. These index sets provide the constraints that facilitate a comparison of solution quality. There are, in fact, two such index sets, $\Omega^+$ and $\Omega^-$, as we again eliminate the explicit use of the infinity norm from the formulation. In the initial problem statement, these constraints were written

$$\begin{aligned}
\alpha_j + k &\ge g(\omega)\,\psi_j(\omega), \\
-\alpha_j + k &\ge -g(\omega)\,\psi_j(\omega), \quad \text{for all } \omega \in \Omega.
\end{aligned}$$

We focus only on the former, as $\Omega^-$ is formulated in a nearly identical manner, and the process has been executed in the strip based constraints. We desire constraints of the form $\langle a(\omega^+), y \rangle \ge b(\omega^+)$, for all $\omega^+ \in \Omega^+$. Let

$$\begin{aligned}
a_j(\omega^+) &= \psi_j(\omega^+), \quad \text{for } j = 1, 2, \ldots, n, \ \omega^+ \in \Omega^+, \\
a_{n+1}(\omega^+) &= 1, \quad \text{for all } \omega^+ \in \Omega^+.
\end{aligned}$$

The associated $b(\omega^+)$ is defined to be $g(\omega^+)$. The constraints associated with the set $\Omega^-$ are formed in exactly the same manner, with sign changes as appropriate.

Concluding, we define our index set $T = Q^+ \cup Q^- \cup P \cup \Omega^+ \cup \Omega^-$.
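To make the assembled index set concrete, the sketch below builds every class of constraint for a deliberately tiny instance and solves it exactly. Several things here are assumptions made for illustration only: the instance has two polygons of unit area covered by a single strip, the target $g$ is taken to be constant on each polygon so that one representative point per polygon suffices for $\Omega^+$ and $\Omega^-$, and the solver is a brute-force enumeration of basic solutions with exact rational arithmetic. It is not the Simplex algorithm, and it is practical only for a handful of variables.

```python
from fractions import Fraction
from itertools import combinations


def solve_square(M, v):
    """Solve M x = v exactly by Gauss-Jordan elimination; None if singular."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(M, v)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def minimize_by_vertices(c, A, b):
    """Minimize <c, y> subject to A y >= b by enumerating basic solutions."""
    n, best = len(c), None
    for idx in combinations(range(len(A)), n):
        y = solve_square([A[i] for i in idx], [b[i] for i in idx])
        if y is None:
            continue
        if all(sum(Fraction(a) * t for a, t in zip(row, y)) >= rhs
               for row, rhs in zip(A, b)):
            value = sum(Fraction(cj) * t for cj, t in zip(c, y))
            if best is None or value < best[0]:
                best = (value, y)
    return best


# Hypothetical instance: y = [alpha_1, alpha_2, k].
areas, strip, data, xi, g = [1, 1], {0, 1}, 4, 0, [1, 2]
n = len(areas)
A, b = [], []
a_q = [areas[j] if j in strip else 0 for j in range(n)]
A.append([-v for v in a_q] + [0]); b.append(-data - xi)   # Q-
A.append(a_q + [0]); b.append(data - xi)                  # Q+
for j in range(n):
    e = [1 if i == j else 0 for i in range(n)]
    A.append(e + [0]); b.append(0)                        # P
    A.append(e + [1]); b.append(g[j])                     # Omega+
    A.append([-v for v in e] + [1]); b.append(-g[j])      # Omega-

k_opt, y_opt = minimize_by_vertices([0, 0, 1], A, b)
print(k_opt, y_opt)
```

In this instance the strip data force $\alpha_1 + \alpha_2 = 4$ while the target asks for $(1, 2)$, so the optimum splits the unavoidable excess evenly between the two polygons.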

D.  THE  IMAGE  RECONSTRUCTION  DUAL 

The image reconstruction optimization problem, as we have formed it, is a specific example of the uniform approximation problem. Consequently, we find some strong similarities in its dual problem to the dual of the approximation of the exponential function. Let us derive the dual, $D$, of our image reconstruction problem. Note that this section is included in the interest of completeness. The material herein is complicated and is not especially enlightening. The reader may wish to skip this section.

We seek a subset, $\{t_1, t_2, \ldots, t_q\} \subset T$, and the non-negative vector $x = [x_1, x_2, \ldots, x_q]^T$ that maximize the expression

$$\langle b, x \rangle$$

and satisfy

$$\langle x, a(t) \rangle = c.$$

We address the selection of the subset first. Consider the strip-based constraints, associated with $Q \subset T$. Recall that $Q = Q^+ \cup Q^-$. We seek some subset of each of these sets. We denote these subsets $\hat{Q}^+$ and $\hat{Q}^-$. With each element of each subset, we associate a non-negative real number, $x(q_j^+)$, for $j = 1, \ldots, n_{\hat{Q}^+}$, and $x(q_j^-)$, for $j = 1, \ldots, n_{\hat{Q}^-}$.

Considering the index set, $P$, with which we associated the polygon based constraints, we seek some non-negative value $x(s_j)$ to associate with each constraint of a subset $\hat{P} \subset P$. Following the above convention, we let $j = 1, \ldots, n_{\hat{P}}$.

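Let us move to the infinite index sets, $\Omega^+$ and $\Omega^-$. As we noted above, there are two classes of constraint vectors associated with our index sets, $\Omega^+$ and $\Omega^-$. In particular,

$$a(\omega^+) = \begin{bmatrix} \psi_1(\omega^+) \\ \psi_2(\omega^+) \\ \vdots \\ \psi_n(\omega^+) \\ 1 \end{bmatrix} \quad \text{and} \quad a(\omega^-) = \begin{bmatrix} -\psi_1(\omega^-) \\ -\psi_2(\omega^-) \\ \vdots \\ -\psi_n(\omega^-) \\ 1 \end{bmatrix}.$$

For each of our index sets, $\Omega^+$ and $\Omega^-$, we seek some discretization $\hat{\Omega}^+ = \{\omega_1^+, \omega_2^+, \ldots, \omega_{n_{\hat{\Omega}^+}}^+\}$ and $\hat{\Omega}^- = \{\omega_1^-, \omega_2^-, \ldots, \omega_{n_{\hat{\Omega}^-}}^-\}$, as well as non-negative scalars, $x(\omega_1^+), x(\omega_2^+), \ldots, x(\omega_{n_{\hat{\Omega}^+}}^+)$ and $x(\omega_1^-), x(\omega_2^-), \ldots, x(\omega_{n_{\hat{\Omega}^-}}^-)$.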

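Then the dual $D$ is to find the above discretization and non-negative $x$ values that maximize the expression:

$$\sum_{i=1}^{n_{\hat{Q}^+}} x(q_i^+)\,b(q_i^+) - \sum_{i=1}^{n_{\hat{Q}^-}} x(q_i^-)\,b(q_i^-) + \sum_{i=1}^{n_{\hat{P}}} x(s_i)\,b(s_i) + \sum_{i=1}^{n_{\hat{\Omega}^+}} x(\omega_i^+)\,b(\omega_i^+) - \sum_{i=1}^{n_{\hat{\Omega}^-}} x(\omega_i^-)\,b(\omega_i^-),$$

while satisfying the constraints:

$$\sum_{i=1}^{n_{\hat{Q}^+}} x(q_i^+)\,a_r(q_i^+) - \sum_{i=1}^{n_{\hat{Q}^-}} x(q_i^-)\,a_r(q_i^-) + \sum_{i=1}^{n_{\hat{P}}} x(s_i)\,a_r(s_i) + \sum_{i=1}^{n_{\hat{\Omega}^+}} x(\omega_i^+)\,a_r(\omega_i^+) - \sum_{i=1}^{n_{\hat{\Omega}^-}} x(\omega_i^-)\,a_r(\omega_i^-) = 0,$$

for $r = 1, 2, \ldots, n$, and

$$\sum_{i=1}^{n_{\hat{Q}^+}} x(q_i^+)\,a_{n+1}(q_i^+) - \sum_{i=1}^{n_{\hat{Q}^-}} x(q_i^-)\,a_{n+1}(q_i^-) + \sum_{i=1}^{n_{\hat{P}}} x(s_i)\,a_{n+1}(s_i) + \sum_{i=1}^{n_{\hat{\Omega}^+}} x(\omega_i^+)\,a_{n+1}(\omega_i^+) - \sum_{i=1}^{n_{\hat{\Omega}^-}} x(\omega_i^-)\,a_{n+1}(\omega_i^-) = 1,$$

$$\begin{aligned}
x(q_i^+) &\ge 0, \quad \text{for } i = 1, 2, \ldots, n_{\hat{Q}^+}, \\
x(q_i^-) &\ge 0, \quad \text{for } i = 1, 2, \ldots, n_{\hat{Q}^-}, \\
x(s_i) &\ge 0, \quad \text{for } i = 1, 2, \ldots, n_{\hat{P}}, \\
x(\omega_i^+) &\ge 0, \quad \text{for } i = 1, 2, \ldots, n_{\hat{\Omega}^+}, \text{ and} \\
x(\omega_i^-) &\ge 0, \quad \text{for } i = 1, 2, \ldots, n_{\hat{\Omega}^-}.
\end{aligned}$$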

While the above formulation of the dual is intimidating, we may simplify immediately by recognizing some features of the constraints of our primal problem. We know that the scalar $b(s_i) = 0$ for all $s_i$. Then the middle term in the objective function disappears completely.

Let us move to the first constraint. The middle sum also collapses to the single term $x(s_r)$, as we have defined $a_r(s_j)$ to be the Kronecker delta, $\delta_{rj}$. The first three terms of the second constraint disappear altogether, as we have specified $a_{n+1}(q_j) = a_{n+1}(s_j) = 0$ for all $j$. The non-negativity constraints remain the same.
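Carrying out these simplifications, the dual may be written out explicitly. The following is a sketch under two assumptions that are stated here rather than taken from the text: the right hand side of the first family of constraints is zero, since the primal objective is $k$ alone and the cost vector is therefore $c = [0, \ldots, 0, 1]^T$, and the term $x(s_r)$ is read as zero whenever $s_r \notin \hat{P}$. The objective becomes

$$\sum_{i=1}^{n_{\hat{Q}^+}} x(q_i^+)\,b(q_i^+) - \sum_{i=1}^{n_{\hat{Q}^-}} x(q_i^-)\,b(q_i^-) + \sum_{i=1}^{n_{\hat{\Omega}^+}} x(\omega_i^+)\,b(\omega_i^+) - \sum_{i=1}^{n_{\hat{\Omega}^-}} x(\omega_i^-)\,b(\omega_i^-),$$

subject to

$$\sum_{i=1}^{n_{\hat{Q}^+}} x(q_i^+)\,a_r(q_i^+) - \sum_{i=1}^{n_{\hat{Q}^-}} x(q_i^-)\,a_r(q_i^-) + x(s_r) + \sum_{i=1}^{n_{\hat{\Omega}^+}} x(\omega_i^+)\,a_r(\omega_i^+) - \sum_{i=1}^{n_{\hat{\Omega}^-}} x(\omega_i^-)\,a_r(\omega_i^-) = 0, \quad r = 1, 2, \ldots, n,$$

$$\sum_{i=1}^{n_{\hat{\Omega}^+}} x(\omega_i^+)\,a_{n+1}(\omega_i^+) - \sum_{i=1}^{n_{\hat{\Omega}^-}} x(\omega_i^-)\,a_{n+1}(\omega_i^-) = 1,$$

together with the non-negativity of every $x$ value.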

E.  A  SAMPLE  SOLUTION 

We  now  use  the  formulation  of  the  image  reconstruction  problem  as  a  linear 
program  to  solve  a  simple  problem.  We  first  discuss  the  geometry  of  the  partition 
that  we  are  using,  and  then  identify  some  additional  simplifying  assumptions  that 
make  the  problem  more  tractable.  We  introduce  the  expected  density  of  our  sample 
problem,  and  conclude  with  the  Simplex  solution  of  the  problem. 

1.  The  Partition 

The partition that we use in this example is illustrated in Figure 16, where the color of a polygon is a function of its area. Larger values correspond to lighter colors. We have chosen the four angles 0, [...], with five strips for each angle. The resulting partition consists of 89 polygons.¹ It should be noted that each strip has a width of [...]. Consequently, as views at angles of [...] and [...] require more than 5 strips to cover the unit square completely, only the portion of the square that falls in the five center strips is considered. The rest is omitted.

Figure 16. The Partition of the Sample Problem

2.  A  Simplifying  Assumption 

Rather than attempt to solve the infinite dimensional problem as derived in the initial portion of this chapter, we project the target density onto the $n$-dimensional polygonal basis of our partition. That is, we insist that the target function be constant on each of the polygons of the partition. This simplification reduces the infinite index sets $\Omega^+$ and $\Omega^-$ to finite sets, as we need only consider a representative $\omega_j \in s_j$ for each polygon $s_j$ when determining the norm of the difference between our target function and optimal function. Without this assumption, the problem is very similar to the infinite problem of approximating the exponential with polynomials, which was discussed in more detail when the Simplex algorithm was introduced. It is possible that this problem is solvable without this simplification, but no attempt is made to solve it in this thesis.

¹ The manner in which the polygons were identified and the areas of each polygon computed is not a matter of particular concern here. It is sufficient to state that the symmetry of the partition was deeply exploited in a manner which simplifies the problem of polygon identification and area computation over 89 polygons to one of many fewer than 89.

We choose, in projecting the target function onto our finite dimensional space, the mass of the function over each polygon divided by the area of the polygon. That is, after the target function $g$ is projected into the finite space, it takes on the form:

$$g = \sum_{j=1}^{n} \beta_j\, \psi_j,$$

where

$$\beta_j = \frac{1}{\text{area}(s_j)} \int_{s_j} g(\omega)\,d\omega.$$

That is, $\beta_j$ is the mass of $g$ over polygon $s_j$, divided by the area of polygon $s_j$.
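The projection rule can be sketched numerically. The example below is illustrative only: it replaces the polygonal partition with a hypothetical uniform grid of square cells on the unit square, approximates each cell integral with a midpoint rule, and uses a made-up target function $g(x, y) = x + y$. The function name `project_onto_grid` and its parameters are assumptions introduced for this sketch.

```python
def project_onto_grid(g, cells_per_side, samples_per_cell=4):
    """Cell averages beta_j of g over a uniform grid on the unit square.

    Each beta_j is (mass of g over the cell) / (area of the cell), with the
    mass approximated by a midpoint rule using samples_per_cell**2 points.
    Cells are returned in row-major order, rows indexed by y.
    """
    h = 1.0 / cells_per_side      # side length of a cell
    sub = h / samples_per_cell    # side length of a quadrature sub-square
    betas = []
    for row in range(cells_per_side):
        for col in range(cells_per_side):
            mass = 0.0
            for i in range(samples_per_cell):
                for j in range(samples_per_cell):
                    x = col * h + (i + 0.5) * sub
                    y = row * h + (j + 0.5) * sub
                    mass += g(x, y) * sub * sub
            betas.append(mass / (h * h))
    return betas


# Hypothetical target function on the unit square.
betas = project_onto_grid(lambda x, y: x + y, cells_per_side=2)
total_mass = sum(beta * 0.25 for beta in betas)
print(betas, total_mass)
```

Because each coefficient is a mass divided by an area, multiplying back by the cell areas and summing recovers the total mass of the original function, up to quadrature error.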

3.  The  Target  Function 

We now identify the target function of the sample problem. We use the simple function,

[...]

for $x \in [0, 1]$ and $y \in [0, 1]$. The projection of the density function is illustrated in Figure 17. The particular data for the constant densities assigned to each of the strips represent the values which we hope, or expect, to find in the solution of our problem, before considering the data. That is, the values are assumed to represent the most likely solution to our problem.

Figure 17. The Projected Target Function

4.  The  Test  Density 

The density function that we use to generate the test data is given by the expression

$$h(x,y) = \begin{cases} \ldots, & \text{for } (x - \ldots) \ldots < \ldots, \\ \ldots, & \text{otherwise.} \end{cases}$$

The density function is displayed in Figure 18. The values of the sample transform defining integrals become the right hand side of our equality constraints when we formulate the problem.

Figure 18. The Test Density Function

The manner in which we have defined the projection of a density onto the finite dimensional space assures us that both the test density $h$ and its projection
have  the  same  sampled  transform.  Thus,  barring  catastrophic  rounding  error,  the 
formulation  is  always  feasible.  That  is,  there  must  be  some  density  function  that 
produces  the  sampled  transform,  even  after  we  have  projected  the  test  density  onto 
the  partition.  If  the  sampled  transform  is  uniquely  determined,  we  reconstruct  the 
projection  perfectly,  though  the  value  of  the  variable  k  may  be  quite  large. 
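The agreement of the two sampled transforms can be verified in one line. The following short derivation assumes that each strip $q$ is a union of whole polygons of the partition, and writes $\bar{h} = \sum_j \beta_j \psi_j$ with $\beta_j = \frac{1}{\text{area}(s_j)} \int_{s_j} h(\omega)\,d\omega$ for the projection of the test density; the symbol $\bar{h}$ is introduced here only for illustration.

$$\int_q \bar{h}(\omega)\,d\omega = \sum_{s_j \subset q} \beta_j\,\text{area}(s_j) = \sum_{s_j \subset q} \int_{s_j} h(\omega)\,d\omega = \int_q h(\omega)\,d\omega.$$

Each strip integral is therefore unchanged by the projection, so the projected density reproduces the data generated from $h$ and satisfies the strip based constraints.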

As a basis of comparison, we note that the maximum difference between the projection of the target density and the test density is given by $d = .1949$. We may certainly expect, then, that the optimal density varies by no more than the above value of $d$.

Figure 19. The Simplex Solution of the Sample Problem

The optimal density as determined by the Simplex algorithm is displayed in Figure 19. The value achieved for the maximum absolute deviation between the target density and the optimal density is $d' = .1577$. We consider the difference over each polygon in Figure 20.

F.  SUITABILITY  OF  SIMPLEX  IN  IMAGE  RECON¬ 
STRUCTION 

We  briefly  consider  the  merit  of  using  the  Simplex  algorithm  to  solve  the  image 
reconstruction  problem.  That  is,  we  wish  to  consider  how  well  the  tool  we  have  chosen 
fits  our  particular  job. 

The results of this particular example show the tendency of this formulation to spread error over the entire region. This consequence, it is believed, results from the use of the infinity norm. We may also question the choice of target functions, and may look to other methods of qualification. However, the optimization problem achieves what it is designed to achieve. That is, we have found the density that satisfies the minimum deviation in the uniform norm sense.

Figure 20. Difference over Each Polygon

We  are  also  forced  to  consider  the  substantial  data  that  are  required  to  solve  the 
problem.  The  problem  of  polygon  identification  is  a  difficult  one  by  itself,  especially 
in  view  of  the  fact  that  a  typical  partition  for  the  CAT  scan  problem  is  generated  by 
200-300  angles  with  up  to  500  strips  per  angle.  With  this  geometry,  we  know  that  the 
number  of  polygons  exceeds  4,000,000.  Further,  we  require  the  area  of  each  polygon 
be known to solve the problem as we have formulated it. Finally, as each polygon gives rise to a variable in our primal problem, we are solving a problem in a subspace of $\mathbb{R}^n$ where $n$ is quite large. On the positive side, we know that we must solve the polygon identification and polygonal area problems only once. Further, the matrix $A$ which results from the formulation above is extremely sparse, which may lead to a more rapid solution of the Simplex problem, or invite other methods of solving linear programs.

In conclusion, the author contends that the Simplex algorithm fits well conceptually, but may not be suited to the vastness of the problem as it is formulated here. Projecting the density functions onto the polygonal partition is conceptually identical to selecting finite subsets of an infinite index set. The theorems presented in regard to the image reconstruction problem indicate that we may solve the infinite dimensional problem through a sequence of finite dimensional problems, when certain conditions are met.

Some  alternatives  that  might  warrant  future  consideration  are  norms  other 
than  the  infinity  norm,  or  using  the  Simplex  model  to  refine  existing  solutions  to  the 
reimaging problem, where the number of variables is less restrictive.

VII


CONCLUSION 


In  conclusion,  we  have  introduced  the  Simplex  algorithm  in  a  context  quite 
apart  from  its  usual  applications.  The  principal  vehicles  for  the  exploration  of  the 
algorithm  were  three  unique  applications,  each  illuminating  distinct  features  of  the 
theory  underlying  the  implementation  of  the  Simplex  algorithm. 

In particular, we began with a problem of finding orthogonal monic polynomials over a closed interval. This example led to a very basic Simplex formulation, and was solved as a finite dimensional problem. The requirement that the polynomials be monic facilitated the relatively simple formulation. Follow-on problems to this example might be the adaptation of the algorithm to generate a non-polynomial orthogonal basis for an infinite dimensional function space, or perhaps to fit the algorithm to solving the non-linear orthonormal basis generation problem.

Second, we formulated the problem of approximating a function over a closed interval in the uniform norm sense. Unlike the first example, the problem was infinite dimensional, in that the formulation required a constraint for every number in the uncountably infinite set. This problem proved particularly helpful in illustrating the principle of weak duality, and ultimately illustrated the Simplex algorithm itself. The special qualities of polynomial approximation were omitted, though the reader is referred to [Ref. 9] for a more complete discussion thereof. Again, potential areas for future research might include approximation with functions other than polynomials.

Each of the above examples was used extensively to illustrate the highlights of convexity and duality, upon which the Simplex algorithm is based. The treatment was relatively general, though many of the theorems required that the linear optimization problem be finite. Work is underway to identify classes of infinite dimensional problems which may be solved by a sequence of finite dimensional problems. The reader is referred to [Ref. 11] for a more complete discussion of this active area of research. Highlights include infinite horizon planning, fuzzy set semi-infinite programming, and linear programming in control theory.

Another area of focus in this paper was the Image Reconstruction problem. Again, this is an area of active research. After presenting the requisite background, we formulated this problem as an infinite dimensional linear optimization problem, and as a special case of the uniform approximation problem. Results were presented that indicated that use of Simplex to solve a sequence of linear programs is conceptually sound, though not necessarily practical in view of the size of the problem. This application of the algorithm, however, is open to more extensive research in a number of areas. A different choice of norms by which the quality of density functions is measured may eliminate a number of constraints. A technique for formulating optimization problems with the 2-norm is found in [Ref. 12], and may prove useful in this application. The Simplex algorithm may also provide an inexpensive method to solve coarser problems, from which one may determine the necessity of constructing more detailed models. Alternately, there may be some utility in using the algorithm to solve the reconstruction problem in only small portions of the set over which a density is defined. If there is utility in such an application, the logical consequence is research of parallel Simplex implementation.

The  potential  utility  of  the  Simplex  algorithm  to  unconventional  applications 
seems  clear.  Even  when  actual  implementation  of  the  algorithm  is  not  practical,  the 
tools of convexity and duality apply to broader areas of optimization.